
Data Centralization Dynamics and Future Trends: A Probabilistic Model for Assumptions, Evolution, and Scenario Outcomes

October 9, 2024

In this technical model, we explore how data centralization impacts power dynamics and individual autonomy, and estimate the likelihood of future scenarios involving continued centralization, revolt, and the emergence of decentralized technologies to restore balance.

This graph represents the projected probabilities of three future scenarios—centralization, revolt, and balance restoration—over time, showing a gradual decrease in centralization and corresponding increases in revolt and balance restoration.

This table shows the time-based probabilities for the scenarios of centralization, revolt, and balance restoration over 20 time periods, indicating a decrease in centralization and corresponding increases in revolt and balance restoration.

This image shows Python code that simulates the probabilities of three future scenarios—centralization, revolt, and balance restoration—over 20 time periods, adjusting these probabilities based on gradual changes and ensuring they sum to 1.

Assumption Explanation and Reasoning

1. The Central Theme: Power, Data, and Autonomy
The core idea driving the model is that data centralization is increasing, which diminishes individual autonomy while concentrating power in the hands of a few centralized entities (e.g., corporations and governments). The model aims to predict what could happen if this trend continues, based on three possible outcomes:

  • Continued Centralization: Data remains concentrated in a few powerful entities, further reducing individual autonomy.
  • Revolt: As people lose autonomy, there is potential for pushback or revolt against these centralized systems.
  • Balance Restored: New decentralized technologies (e.g., Solid, Verifiable Credentials) emerge to balance power between individuals and institutions, countering centralization.

 

2. Initial Probabilities: Setting the Stage
To model these outcomes, we needed to assign probabilities reflecting the current state of affairs. The numbers used are educated estimates based on current trends, societal movements, and technological development.

  • 70% Probability of Continued Centralization:
    This high probability reflects the ongoing dominance of centralized data systems, where large tech companies like Google, Facebook, Amazon, and Apple control vast amounts of global and personal data.
    Governments are also increasing their data collection efforts, whether for surveillance, taxation, or welfare purposes (e.g., China's social credit system, NSA programs in the US).
    The lack of widely adopted, viable alternatives forces users to rely on centralized services, reinforcing the trend of power centralization. This leads to a high probability of 70% for continued centralization. While this figure is subjective, it captures the strength of the current centralization trend, based on general observation of the technological landscape.
  • 20% Probability of Revolt:
    The probability of revolt reflects growing societal concerns about data privacy and centralization, though it’s not yet the dominant reaction.
    Privacy scandals, such as the Cambridge Analytica incident, and growing public concerns over surveillance (e.g., Snowden’s NSA revelations) suggest increasing discomfort with how data is used.
    Legislative responses like the GDPR in the EU show some governmental pushback, and movements for decentralized technologies, such as Web3 and blockchain, offer potential resistance.
    However, these movements are still smaller compared to the entrenched centralized powers, so the probability of revolt is set at 20%, indicating growing but not overwhelming dissatisfaction and resistance.
  • 10% Probability of Balance Restoration:
    Technologies like Solid and Verifiable Credentials offer promising alternatives that could restore balance, giving individuals control over their own data. These systems allow for decentralized identity and data management, which could counterbalance centralization.
    However, these technologies are in their early stages and face barriers to mass adoption due to complexity, scalability, and governance issues. Decentralized alternatives like DeFi, peer-to-peer networks, and blockchain-based solutions (e.g., IPFS, Filecoin) show promise but are not yet mainstream.
    Given these facts, the probability of these systems restoring balance in the near future is relatively low (10%), though not negligible.
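The three starting estimates can be written down as a small data structure and checked for consistency. The sketch below is only illustrative: the names `INITIAL_PROBABILITIES` and `is_valid_distribution` are assumptions made for this example, not taken from the original code.

```python
# Starting probabilities as given in the text (subjective estimates, not measured data).
INITIAL_PROBABILITIES = {
    "centralization": 0.70,
    "revolt": 0.20,
    "balance": 0.10,
}


def is_valid_distribution(probs, tol=1e-9):
    """Return True if every value lies in [0, 1] and the values sum to 1."""
    in_range = all(0.0 <= p <= 1.0 for p in probs.values())
    # A small tolerance absorbs floating-point rounding in the sum.
    return in_range and abs(sum(probs.values()) - 1.0) <= tol


print(is_valid_distribution(INITIAL_PROBABILITIES))
```

The tolerance matters because decimal fractions such as 0.7 and 0.1 are not represented exactly in binary floating point.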

3. Transitioning Over Time: The Dynamic Shift
Because the future is uncertain, the model assumes gradual changes over time, with the following reasoning:

  • Centralization decreases slightly: As decentralized technologies grow in popularity, the trend toward centralization does not go unchecked. For example, decentralized finance (DeFi) or community-driven data efforts could slowly erode it. Hence, the probability of centralization decreases by 1 percentage point at each step.
  • Revolt probability increases: The longer centralized systems dominate, the more autonomy people lose and the more aware they become of privacy and control issues, so resistance is expected to grow. Therefore, the revolt probability increases by 0.5 percentage points at each step.
  • Balance probability increases: Similarly, as decentralized identifiers, blockchain, and related solutions gain traction, the probability of restoring balance increases. These technologies may gradually reduce the dominance of centralized systems. Thus, the probability of balance restoration also increases by 0.5 percentage points at each step.
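Because each scenario changes by a fixed amount per step, the drift can be computed directly for any step. The following is a minimal sketch, assuming the per-step changes are absolute (percentage points) and that the starting state counts as step 0; `START`, `STEP_CHANGE`, and `probabilities_after` are hypothetical names used only for illustration.

```python
# Starting estimates from the text.
START = {"centralization": 0.70, "revolt": 0.20, "balance": 0.10}

# Per-step changes, read as absolute changes in probability (an assumption).
STEP_CHANGE = {"centralization": -0.010, "revolt": 0.005, "balance": 0.005}


def probabilities_after(steps):
    """Apply the fixed per-step changes `steps` times (linear drift)."""
    return {name: START[name] + steps * STEP_CHANGE[name] for name in START}


for t in (0, 1, 10, 20):
    row = probabilities_after(t)
    print(t, {name: round(value, 3) for name, value in row.items()})
```

Under these assumptions the three changes cancel (-1 + 0.5 + 0.5 = 0), so the total stays at 100% at every step, and after 20 steps the arithmetic gives roughly 50% centralization, 30% revolt, and 20% balance restoration.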
4. Normalization: Ensuring Realistic Probabilities
At each time step, the total probability of all three scenarios must sum to 1 (or 100%). This is a common technique in probabilistic models to ensure the system remains closed and the probabilities are relative to each other. After calculating the changes in centralization, revolt, and balance probabilities at each time step, the values are normalized to maintain a sum of 1.

5. Time Horizon: 20 Time Periods
The time horizon of 20 steps is arbitrary but reasonable for showing gradual trends. The periods could represent years, decades, or any other meaningful unit depending on the context. The assumption is that these probabilities evolve slowly, as societal and technological changes typically take time to manifest.
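The update, normalization, and 20-period horizon described above can be combined into one short simulation. This is a sketch under stated assumptions, not a reproduction of the code shown in the image: the function names, the clamping at zero, and the reading of the per-step changes as absolute values are all assumptions made for illustration.

```python
INITIAL = {"centralization": 0.70, "revolt": 0.20, "balance": 0.10}
CHANGES = {"centralization": -0.010, "revolt": 0.005, "balance": 0.005}


def normalize(probs):
    """Rescale the values so they sum to 1."""
    total = sum(probs.values())
    if total <= 0:
        raise ValueError("probabilities must have a positive sum")
    return {name: value / total for name, value in probs.items()}


def simulate(initial, changes, periods=20):
    """Return the starting distribution followed by one distribution per period."""
    history = [normalize(initial)]
    for _ in range(periods):
        previous = history[-1]
        # Apply the per-step change, clamping at zero so no value goes negative.
        updated = {
            name: max(previous[name] + changes[name], 0.0) for name in previous
        }
        history.append(normalize(updated))
    return history


history = simulate(INITIAL, CHANGES)
for period, row in enumerate(history):
    print(period, {name: round(value, 3) for name, value in row.items()})
```

With the step sizes given in the text, the three changes already cancel out, so normalization here mainly guards against floating-point drift. It would do real work if the step sizes were changed so that they no longer sum to zero, or if clamping at zero ever took effect.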

Summary of Assumptions and Reasoning

  • 70% Centralization: Reflects the current dominance of centralized systems.
  • 20% Revolt: Accounts for growing discontent and emerging privacy/autonomy movements.
  • 10% Balance Restored: Represents the potential of new decentralized technologies, though they face adoption challenges.
  • Gradual Shift: Centralization decreases slightly over time, while revolt and balance probabilities increase due to growing awareness, technological advancements, and resistance movements.

Thought Experiment Approach

These probabilities rest on qualitative judgment and logical reasoning about observed trends rather than on empirical data. This kind of modeling is commonly used in strategic foresight, scenario planning, and policy design when hard data is unavailable. For example, futurists and policymakers often develop potential future scenarios using expert judgment and informed speculation to assign probabilities to various outcomes.

If real-world data were used, more precise estimates could be derived from public opinion surveys, adoption rates of decentralized technologies, growth trends in Big Tech, and legislative actions related to data privacy. However, in the absence of such data, these numbers represent educated guesses to provide a reasonable framework for thinking about the future of data governance.

Conclusion

The model’s probabilities reflect qualitative reasoning rather than empirical evidence. They offer informed speculation about how data centralization, revolt, and balance restoration might evolve over time. While real-world data could refine the model, the current estimates offer a useful framework for considering the future trajectory of data control and its potential implications for society.

Solid and Verifiable Credentials (VCs) are two emerging technologies that aim to give individuals more control over their personal data and how it is shared and used. They are part of the broader movement toward decentralization and data privacy. Here's a breakdown of what they are and how they work:

  1. Solid
    Solid is a decentralized web platform developed by Tim Berners-Lee, the inventor of the World Wide Web. Its goal is to return data control to individuals and move away from the current model where large corporations and centralized entities own and manage vast amounts of personal data.

Key Features of Solid:

  • Decentralized Personal Data Stores (Pods): In Solid, your personal data is stored in "Pods" (Personal Online Data Stores), which you control. Instead of storing your data across various platforms like Facebook, Google, or Amazon, you keep your data in your own pod, which can be hosted on a server you trust or even on your own device.
  • Data Interoperability: Solid allows different applications to access your pod with your permission. This means you can use different apps to interact with your data, but the data remains under your control and doesn’t have to reside on those platforms' servers. For example, you could switch