The matrices U and V are unitary matrices of the left and right singular vectors of matrix A, and S is a diagonal matrix containing the singular values σi of A. Note that the NCI is a characteristic property of the corresponding document-entity matrix because it is calculated from its singular values σi.

Because the number of documents changes each day, whereas the number of entities stays constant, all NCI values in our analyses are normalised by dividing them by the number of documents in the corpus, m. We have statistically confirmed that the NCI is significantly above the level of fluctuations of the random null model of cohesiveness (see Section 2 of the Supplementary Data).
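The decomposition and the normalisation by m can be illustrated with a minimal sketch. The toy matrix and the function names `singular_values` and `normalised_index` are hypothetical. The aggregate used here (the square root of the sum of squared singular values) is only an assumed stand-in for the index, whose actual definition is given elsewhere in the paper and may differ.

```python
import numpy as np

def singular_values(A):
    """Return the singular values of a document-entity matrix A (m documents x n entities)."""
    return np.linalg.svd(A, compute_uv=False)

def normalised_index(A):
    """Assumed stand-in for the index: an aggregate of the singular values divided by m.

    The aggregate (root of the sum of squared singular values) is an
    illustrative assumption, not the definition used in the paper.
    """
    m = A.shape[0]  # number of documents in the corpus
    sigma = singular_values(A)
    return np.sqrt(np.sum(sigma ** 2)) / m

# Toy occurrence matrix (made up for illustration): 4 documents x 3 entities.
A = np.array([
    [1, 0, 1],
    [1, 1, 0],
    [0, 1, 1],
    [1, 1, 1],
], dtype=float)

# Full decomposition A = U S V^H; numpy returns V^H directly as Vh.
U, s, Vh = np.linalg.svd(A, full_matrices=True)

# Embed the singular values in a rectangular diagonal matrix S (4 x 3).
S = np.zeros_like(A)
S[:len(s), :len(s)] = np.diag(s)

reconstructed = U @ S @ Vh      # should reproduce A up to rounding
index = normalised_index(A)     # aggregate of singular values divided by m = 4
```

Dividing by m keeps values comparable across days on which different numbers of documents were collected, which is the motivation stated above.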

We adopt the terminology from ref. 9 and treat our news-based indicators (NCI variants and entity occurrences) as indicators of the data supply in online media, whereas volumes of Google search queries are treated as indicators of knowledge demand.

Data supply indicators: cohesiveness index based on all the news from NewStream (NCI), cohesiveness index based only on filtered financial news from NewStream (NCI-financial), total entity occurrences across all news documents, and total entity occurrences across strictly financial documents from NewStream.

Selecting only financial documents also improves the correlation with other financial indices, as shown in Figure 5. For more details concerning the number of financial documents and how it affects correlations with several other indices, see Section 3 of the Supplementary Data.