The world's most valuable AI data is locked away. Not by technology. By a lack of trust.



Let me offer a different perspective on what I think OpenLedger is fundamentally about.
Fairness is the common theme of the story. Pay data contributors. Democratize AI. Pay the individuals who created the models.

That's real. And it matters.
However, I believe it misses the more economically significant problem that OpenLedger could address.

Not data compensation. Data unlocking.
This is the data problem no one is stating clearly enough.

Most of the best training data in the world is not on the internet.

It's in hospital systems that cannot share patient information without violating HIPAA regulations. It's in pharmaceutical research databases, inaccessible because of confidentiality agreements. It's in banks with years and years of proprietary trading data. It's in law firms whose case files would change the nature of AI legal reasoning. It's in manufacturing plants where sensor data could transform the way we do predictive maintenance.

This data exists. It's genuinely valuable. An AI system trained on it would be far more capable than one trained only on public internet data.

But almost none of it can actually be used for AI training.

Not because of technical problems. Because of a lack of trust.

The hospital can't share patient information because it can't guarantee the data won't be passed on, or verify what happens to it after it is shared. The pharmaceutical company can't share research data because it can't demonstrate attribution and control over downstream use. The financial institution can't provide trading data because it can't maintain the audit trail that regulators require.

These are not technology problems. They are provenance and traceability problems.

Who used this data? For what purpose? Can we prove it? Can we audit it? Can we enforce restrictions on how it is shared?

Until those questions have answers, the data stays locked, no matter how much money you offer the contributors.

This is where OpenLedger's infrastructure gets interesting, beyond the "pay contributors fairly" story.

Proof of Attribution isn't simply a payment rail. It generates verifiable data lineage.
Every dataset added to a Datanet carries a cryptographic history of where it came from, how it has been used, and how it affects model output. That history lives on-chain rather than in any single organization's database, and anyone with proper access can audit it.
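
To make "cryptographic history" a bit more concrete, here is a minimal sketch of a hash-chained lineage log. This is my own illustration of the general idea, not OpenLedger's actual implementation. The event fields, the names `append_event` and `verify`, and the dataset and party names are all hypothetical. Each entry commits to the hash of the entry before it, so editing any past record breaks every later link.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the very first entry


def entry_hash(prev_hash, event):
    """Hash an event together with the hash of the entry before it."""
    payload = json.dumps({"prev": prev_hash, "event": event}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_event(log, event):
    """Append an event, chaining it to the previous entry's hash."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"prev": prev_hash, "event": event,
                "hash": entry_hash(prev_hash, event)})
    return log


def verify(log):
    """Return True only if every entry still matches the recorded chain."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True


# Hypothetical lineage: a dataset is registered, then used for training.
log = []
append_event(log, {"action": "registered", "dataset": "example-dataset", "by": "hospital-a"})
append_event(log, {"action": "used_for_training", "dataset": "example-dataset", "by": "model-x"})
print(verify(log))  # True

# Quietly rewriting history is detectable.
log[0]["event"]["by"] = "someone-else"
print(verify(log))  # False
```

A real system would add signatures, access control, and an actual chain underneath. The point is only that an auditor can check the history without trusting whoever stores it.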

For the hospital considering sharing de-identified patient data, that's a legally defensible chain of custody. We can demonstrate what has happened to this data. We can prove that it is used only for the purposes we disclosed. We can hand the audit trail to compliance staff and regulators when they ask for it.

For the pharmaceutical company, that's attribution evidence that safeguards proprietary research. We can prove our data was used. We can enforce licensing terms. We can share in the value our contribution creates without putting the underlying research at risk.

For the financial institution, that's the regulatory documentation that makes AI integration possible. We can demonstrate data governance. We can meet audit requirements. We don't need to sacrifice confidentiality to develop AI.

The potential market represented by this unlock is much bigger than the market for rewarding individual data contributors.

The next frontier for improving AI capability is enterprise data: the locked, high-value, domain-specific data that sits in institutional systems. Models trained on it will be dramatically more powerful than models that cannot access it.

And the data isn't locked because the organizations that hold it don't want to be part of AI development.

Many of them want to. They hold back because they cannot meet the trust, provenance, and audit standards that responsible data sharing requires.

OpenLedger's infrastructure might be the key.
Not for individuals collecting a few points for writing samples. For institutions unlocking billions of dollars' worth of data assets for AI training.

I want to be clear that this takes a lot of work.
Bringing blockchain-based data infrastructure into the enterprise isn't quick. Procurement can take a long time. Legal review is very thorough. Integrating with existing data governance is difficult.

The individual contributor market (where individual people upload datasets to Datanets for a token reward) can grow quickly. The institutional market moves at a completely different pace.

There's a chicken-and-egg problem. Institutions will integrate once the infrastructure is proven at scale. But the infrastructure is only proven at scale once enough institutions integrate.

There are two ways to break this cycle. Either a marquee institution has to go first, or the cost of sitting on valuable data with no way to share it has to exceed the cost of investing in new infrastructure.

Both of these conditions are developing. Neither is here yet.

But here's what I keep coming back to.

The internet did not reach its potential when people started sharing information on it. It truly came into its own when institutions (banks, retailers, media, governments) started building on top of it.

OpenLedger's individual contributor layer is the foundation. Interesting, real, worth building.

The economic scale sits at the institutional layer.
If OpenLedger can build the trust infrastructure that makes institutional data sharing possible, and if the regulatory environment keeps moving in a direction that makes verifiable provenance a compliance requirement rather than a nice-to-have, then $OPEN is not being priced for that addressable market.

Not even close.

This is either a very large opportunity.
Or a thesis that doesn't play out quickly enough for many $OPEN holders to wait for.

I'm honestly not sure which.

But I'm watching the institutional adoption signals more closely than the individual contributor signals.

That's where the real market is, after all.
Do you think enterprise data unlocking is OpenLedger's biggest opportunity, bigger than individual contributor compensation?

@OpenLedger $OPEN #OpenLedger