On Wed Oct 14, 2020 at 4:19 PM -03, Jonathan Corbet wrote:
>
> On Wed, 14 Oct 2020 11:56:44 +0200
> Mauro Carvalho Chehab <mchehab+huawei@xxxxxxxxxx> wrote:
>
> > > To make the first step possible, disable the parallel_read_safe option
> > > in Sphinx, since the dictionary that maps the files to the C namespaces
> > > can't be concurrently updated. This unfortunately increases the build
> > > time of the documentation.
> >
> > Disabling parallel_read_safe will make performance very poor.
> > Doesn't the C domain store the current namespace somewhere?
> > If so, then, instead of using the source-read phase, something
> > else could be used instead.

The issue is that C domain parsing happens at an earlier phase in the
Sphinx process, and the current stack containing the C namespace is long
gone by the time we do the automatic cross-referencing at the
doctree-resolved phase.

Not only that, but the namespace isn't assigned to the file it appears in,
or vice versa. Sphinx only cares about assigning the C directive it is
currently reading to the current namespace, so from its point of view
there is no reason to record which namespaces appeared in a given file.
That is exactly the information we want, but Sphinx doesn't have it. For
instance, printing all symbols from app.env.domaindata['c']['root_symbol']
shows every single C namespace, but the docname field in each of them is
None.

That's why the way to go is to assign the namespaces to the files at the
source-read phase on our own.

> That seems like the best solution if it exists, yes. Otherwise a simple
> lock could be used around c_namespace to serialize access there, right?

Actually, I was wrong when I said that the issue was that "they can't be
concurrently updated". When parallel_read_safe is enabled, Sphinx spawns
multiple processes, not multiple threads, in order to get true concurrency
by sidestepping Python's GIL. So a lock wouldn't help: the same
c_namespace variable isn't even accessible across the multiple processes.
Reading multiprocessing's documentation [1] it seems that memory could be
shared between the processes using Value or Array, but both would need to
be passed to the processes by the one that spawned them, that is, it would
need to be done from Sphinx's side.

So, at the moment I'm not really seeing a way to have this information
shared concurrently between the Python processes, but I will keep
searching.

Thanks,
Nícolas

[1] https://docs.python.org/3/library/multiprocessing.html#sharing-state-between-processes