It really depends on what you do with the data and how much data is stored on the page. The example I linked ran a bunch of regexes against the surrounding page, which on large pages consumed around 10% CPU. My main point is that a module normally should not need to know about the context it is called from: everything it needs to perform its task should be passed in as parameters. That's just good general practice in software development. If the module depends on its context, then changing the page might inadvertently break the module.
In your case, relying on parameters alone is obviously either not possible or would be very cumbersome, so reading the page might be the only way to do it.
Another question is why the components that actually "own" the data don't generate the correct categories themselves. The index page status (not-proofread etc.) is definitely already exposed, so you might be duplicating some categories. Reading and parsing the surrounding page should be the last resort.
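
To make the point about parameters versus context a bit more concrete, here is a rough sketch in Python (not the actual module code, and not the wiki's own scripting language). The function names, the `status` field, the `{{index` marker and the category names are all made up for illustration. The first function gets everything as arguments; the second scrapes the surrounding page text with regexes and so quietly stops working when the page markup changes.

```python
import re


def categories_from_params(status: str, has_index: bool) -> list[str]:
    """Context-free: everything needed arrives as an argument."""
    cats = []
    if status == "not-proofread":      # hypothetical status value
        cats.append("Not proofread")   # hypothetical category name
    if not has_index:
        cats.append("Missing index page")
    return cats


def categories_from_page(page_text: str) -> list[str]:
    """Context-dependent: parses the surrounding page to find the same data."""
    # Assumes the page contains a line like "| status = not-proofread".
    match = re.search(r"\|\s*status\s*=\s*([\w-]+)", page_text)
    status = match.group(1) if match else ""
    # Assumes an index is present when the page has a "{{index" template call.
    has_index = re.search(r"\{\{\s*index\b", page_text, re.IGNORECASE) is not None
    return categories_from_params(status, has_index)


if __name__ == "__main__":
    page = "{{index|foo}}\n| status = not-proofread"
    print(categories_from_page(page))

    # Someone renames the field on the page; the scraper silently finds nothing.
    renamed = "{{index|foo}}\n| progress = not-proofread"
    print(categories_from_page(renamed))
```

With the renamed field the second call returns an empty list without any error, which is the kind of inadvertent breakage I mean. The parameter-based version cannot break that way, because the caller has to hand over the status explicitly.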