1"""All context info. 

2 

3* Data 

4* Workflow 

5* User 

6* User content cache 

7""" 

8 

9from config import Config as C, Names as N 

10from control.typ.types import Types 

11from control.utils import pick as G, serverprint 

12from control.workflow.apply import WorkflowItem 

13 

14 

15CB = C.base 

16CT = C.tables 

17 

18DEBUG = CB.debug 

19DEBUG_CACHE = G(DEBUG, N.cache) 

20 

21VALUE_TABLES = set(CT.valueTables) 

22 

23 

24class Context: 

25 """Combines low-level classes and adds caching. 

26 

27 Several classes deal with database data, and they 

28 might be needed all over the place, so we combine them in a 

29 Context singleton for easy passing around. 

30 

31 The Context singleton is at the right place to realize some database caching. 

32 A few Db methods have a corresponding method here, which first checks a cache 

33 before actually calling the lower level Db method. 

34 

35 A few notes on the lifetimes of those objects and the cache. 

36 

37 Before the Flask object is constructed, the factory reads data from config files 

38 and MongoDb and stores it in data structures which become bound to the Flask 

39 object. 

40 

41 !!! caution "Python data lives per worker" 

42 All data bound to the Flask app is per worker. 

43 The webserver may spawn several processes, and they all get a copy 

44 of this data (because of `gunicorn --preload`), but after that, each 

45 copy is independent. 

46 

47 !!! caution "MongoDb connection" 

48 Although the Mongo connection could be constructed before the fork, 

49 copying the connection to workers is bad. 

50 So that connection will be closed after initialization. 

51 Whenever a worker needs access to MongoDb again, it will create a connection 

52 and store it, so that each worker has only a single connection to MongoDb. 

53 

54 !!! hint "Truly global data in the database" 

55 The only truly global data is data stored in the MongoDb. 

56 That is the ultimate source of truth for all workers. 

57 

58 It makes sense for workers to cache some data between requests and other data 
 
59 just for the duration of a single request. 

60 

61 Store | lifetime | what is stored 

62 --- | --- | --- 

63 MongoDb | permanent | all app tables 

64 MongoDb | permanent | the workflow table, see `control.workflow.compute.Workflow` 

65 `control.db.Db` | worker process | cache for all data in all value tables 

66 `control.auth.Auth` | request | holds current user data 

67 `control.context.Context.cache` | request | cache for some records in user tables 

68 

69 !!! note "Why needed?" 

70 During a request, several records may be shown, with their details. 

71 They have to be fetched in order to get the permissions. 

72 Details may require the permissions of the parents. Many records may share 

73 the same workflow information. 

74 Caching prevents an explosion of record fetches. 

75 

76 However, we should not cache this data between requests, 

77 because the records that benefit most from caching are exactly the ones 

78 that are changed frequently by users. 

79 

80 !!! note "Individual items" 

81 The cache stores individual record and workflow items (by table and id) 

82 straight after fetching them from mongo, via Db. 

83 

84 !!! note "versus Db caching" 

85 The records in value tables are already cached in Db itself. 

86 Such records will not go in this cache. 

87 And other workers will do the same when they need that table. 

88 But this happens very rarely. 

89 

90 !!! caution "refreshing the Db cache" 

91 Another worker may have changed a value in a value table. 

92 Then our cache of that table is invalid. 

93 We detect it by inspecting the table `collect` in MongoDb. 

94 See `control.db.Db.recollect`. 

95 """ 

96 

97 def __init__(self, db, wf, auth): 

98 """## Initialization 

99 

100 Creates a context singleton and initializes its cache. 

101 

102 This class has some methods that wrap a lower level Db data access method, 

103 to which it adds caching. 

104 

105 Parameters 

106 ---------- 

107 db: object 

108 See below. 

109 wf: object 

110 See below. 

111 auth: object 

112 See below. 

113 """ 

114 

115 self.db = db 

116 """*object* The `control.db.Db` singleton 

117 

118 Provides methods to retrieve user 

119 info from the database and store user info there. 

120 """ 

121 

122 self.wf = wf 

123 """*object* The `control.workflow.compute.Workflow` singleton 

124 

125 Provides methods to handle workflow. 

126 """ 

127 

128 self.auth = auth 

129 """*object* The `control.auth.Auth` singleton 

130 

131 Provides methods to access the attributes of the current user. 

132 """ 

133 

134 self.types = Types(self) 

135 """*object* The `control.typ.types.Types` singleton 

136 

137 Provides methods to deal with values and their types. 

138 """ 

139 

140 self.cache = {} 

141 """*dict* The cache to store items from the database. 

142 

143 The cache lives as long as the request. 

144 """ 

145 

146 db.recollect() 

147 

148 def getItem(self, table, eid, requireFresh=False): 

149 """Fetch an item from the database, possibly from cache. 

150 

151 Parameters 

152 ---------- 

153 table: string 

154 The table from which the record is fetched. 

155 eid: ObjectId 

156 (Entity) ID of the particular record. 

157 requireFresh: boolean, optional `False` 

158 If True, bypass the cache and fetch the item straight from Db and put the 

159 fetched value in the cache. 

160 

161 Returns 

162 ------- 

163 dict 

164 The record as a dict. 

165 """ 

166 

167 if not eid: 

168 return {} 

169 

170 db = self.db 

171 

172 if table in VALUE_TABLES: 

173 return db.getItem(table, eid) 

174 

175 return self.getCached( 

176 db.getItem, N.getItem, [table, eid], table, eid, requireFresh, 

177 ) 

178 

179 def refreshCache(self): 

180 """Refresh the cache. 

181 

182 All values stored in value tables will be cached. 

183 But all workers will end up with their own cache. 

184 They signal each other when to refresh their caches if they change one of these 

185 values. 

186 

187 But you can also manually trigger all workers to refresh their caches. 

188 

189 Returns 

190 ------- 

191 bool 

192 Whether the cache refresh has been executed (only for system administrators). 

193 """ 

194 

195 auth = self.auth 

196 db = self.db 

197 

198 done = False 

199 if auth.sysadmin(): 

200 db.recollect(True) 

201 done = True 

202 return done 

203 

204 def resetWorkflow(self): 

205 """Recompute the workflow table. 

206 

207 The workflow table contains only information that can be derived from the 

208 other tables. In case the workflow table appears out of sync, a system 

209 administrator can trigger a clearing of the workflow table followed by 

210 a recomputation of all workflow info. 

211 

212 Returns 

213 ------- 

214 int 

215 The number of resulting workflow records. 

216 If the recomputation did not take place, -1 is returned. 

217 """ 

218 

219 auth = self.auth 

220 wf = self.wf 

221 

222 nWf = -1 

223 if auth.sysadmin(): 

224 nWf = wf.initWorkflow(drop=True) 

225 return nWf 

226 

227 def getWorkflowItem(self, contribId, requireFresh=False): 

228 """Fetch a single workflow record from the database, possibly from cache. 

229 

230 Parameters 

231 ---------- 

232 contribId: ObjectId 

233 The id of the workflow item to be fetched. 

234 requireFresh: boolean, optional `False` 

235 If True, bypass the cache and fetch the item straight from Db and put the 

236 fetched value in the cache. 

237 

238 Returns 

239 ------- 

240 WorkflowItem or None 
 
241 The record wrapped in a `control.workflow.apply.WorkflowItem` object, 
 
242 or `None` if `contribId` is empty. 

243 """ 

244 

245 if not contribId: 

246 return None 

247 

248 db = self.db 

249 wf = self.wf 

250 

251 info = self.getCached( 

252 db.getWorkflowItem, 

253 N.getWorkflowItem, 

254 [contribId], 

255 N.workflow, 

256 contribId, 

257 requireFresh, 

258 ) 

259 if not info: 

260 info = wf.computeWorkflow(contribId=contribId) 

261 return WorkflowItem(self, info) 

262 

263 def deleteItem(self, table, eid): 

264 """Delete a record and also remove it from the cache. 

265 

266 Parameters 

267 ---------- 

268 table: string 

269 The table which holds the record to be deleted. 

270 eid: ObjectId 

271 (Entity) id of the record to be deleted. 

272 """ 

273 

274 db = self.db 

275 cache = self.cache 

276 

277 good = db.deleteItem(table, eid) 

278 if table not in VALUE_TABLES: 

279 key = eid if type(eid) is str else str(eid) 

280 if table in cache: 

281 cachedTable = cache[table] 

282 if key in cachedTable: 

283 del cachedTable[key] 

284 return good 

285 

286 def getCached(self, method, methodName, methodArgs, table, eid, requireFresh): 

287 """Helper to wrap caching around a raw Db fetch method. 

288 

289 Only for methods that fetch single records. 

290 

291 Parameters 

292 ---------- 

293 method: function 

294 The raw `control.db.Db` method. 

295 methodName: string 

296 The name of the raw Db method. Only used to display if cache 

297 debugging is on. 

298 methodArgs: iterable 

299 The arguments to pass to the Db method. 

300 table: string 

301 The table from which the record is fetched. 

302 eid: ObjectId 

303 (Entity) ID of the particular record. 

304 requireFresh: boolean 

305 If True, bypass the cache and fetch the item straight from Db and put the 

306 fetched value in the cache. 

307 

308 Returns 

309 ------- 

310 mixed 

311 Whatever the underlying fetch method returns or would return. 

312 """ 

313 cache = self.cache 

314 

315 key = eid if type(eid) is str else str(eid) 

316 

317 if not requireFresh: 

318 if table in cache: 

319 if key in cache[table]: 
 
320 if DEBUG_CACHE: 

321 serverprint(f"""CACHE HIT {methodName}({key})""") 

322 return cache[table][key] 

323 

324 result = method(*methodArgs) 

325 cache.setdefault(table, {})[key] = result 

326 return result