Baozou Commentary: Nowadays, people's lives are increasingly inseparable from data analysis. From organizational operations to medical decisions, and even to planning athletes' training, we rely on data to make plans. But how do we know that we can really rely on this data? The issue of data reliance is not a technical or academic one, but it is by no means one that can be ignored. This article explores it in depth.

Translation: spring_zqy

We all live in the era of big data, and I am not saying this to scare you. Data analysis and data-driven decision making are the gold standard in more and more areas. From organizational operations to medical decision making, and even to planning athletes' training, data is increasingly at the core of life. Many people think this is necessary. I would like to ask a naive question: how do we know that we can really rely on this data?

Let me be clear that I am not questioning the use of data to improve decision making. On the contrary, I think this is exactly what we should do. Nor is my question about the technology of data maintenance. (Search Baidu for "data integrity", "data loss", "data corruption" and similar terms, and you will find countless standards and best practices that address these.) My question is actually very basic. Suppose there is a dataset relevant to the decision at hand, and there is nothing technically wrong with it. How can we actually rely on its contents?

Building Trust

If you think about it, you can quickly see that there are two ways we come to rely on data: vertical integration and trusted third parties. Let me explain each with the simplest possible example.

Vertical integration covers any scenario where we own the entire value chain that produces the dataset. With a Fitbit (or a similar activity tracker), you own the sensor that generates the data, and you are present when the data is generated.
At the end of the day, you get data showing your activity, so you can say, "I moved a little more today: I walked 3,000 steps instead of my usual 2,000." Because you own every step in the creation of this activity dataset, it is easy to accept its validity.

A trusted third party is any party that we trust to generate or track data, so that we accept the validity of the data it reports. When I want to know how many people visit my web pages, I check Google Analytics. Because I trust Google (although I wonder whether I should), I accept the validity of its data.

Gray Area

These examples are simple, but the scope of the data reliance problem is obviously large. The scenarios above show that unless we own the data-generating process, we are forced to trust another party before we can rely on its data. The complexity of this situation becomes apparent when we consider data generated by the Internet of Things.

In 2007, the I-35W Mississippi River Bridge collapsed, killing 13 people. The bridge has since been rebuilt, and it now incorporates more than 500 sensors that monitor tension, load distribution, vibration, temperature, and more. On the surface, this seems to solve the problem. If there are any worrying signs, the sensors will tell us in advance, and we can send maintenance teams to prevent disaster. But for this to work, we must rely entirely on the sensor data. So the question is: why wouldn't we?

Consider a scenario where a sensor malfunctions and continuously reports that the bridge is fine. For simplicity, assume the sensor emits two kinds of readings: "good", indicating that the bridge is fine, and "bad", indicating that we need to send a maintenance team.
Now suppose the data is not vertically integrated: the IoT sensors are owned by Company A, and the maintenance team comes from Company B. If the bridge collapses while the sensors are transmitting "good", Company A is fully responsible, because its sensors did not transmit correct information. If, however, the sensors sent "bad" and no maintenance team arrived on site, Company B is fully responsible. This gives Company A ample incentive to "cheat" by altering the dataset to show that the sensor did send an alert and that Company B ignored it.

I use this sad example to show the severity of the problem, but the same situation can easily arise with your TV, dishwasher, or any other IoT device.

Blockchain Solutions

Given major market trends, this technology will soon be available to everyone. I would also like to stress that data reliance is not an academic issue at all, nor is it one that we can label "worry about it later". Law enforcement agencies around the world are already expanding their use of cellular data to locate criminal suspects. The mined data can serve as supporting evidence to incriminate a suspect or to establish an alibi. In one past case, a network engineer was accused of killing his wife. He used his access to network equipment to fake a call from his wife to him, when in fact she was already dead.

I think the issue of data reliance is very important, and we need to keep reflecting on it. As it happens, I am in a position not only to raise this question but also to propose a solution. It is clear to me that we already have an infrastructure for deciding what is true and reaching consensus among peers, without having to own all the data ourselves or blindly trust someone else's data.
That infrastructure is, of course, blockchain technology, which I think can integrate with many existing systems and is critical to taking data reliance to a new level. In particular, I believe blockchain-based solutions are needed to unlock more value from data in areas such as data analysis, insurance claims, records management, and regulatory compliance.

In short, data is becoming more and more important to our lives, and I will keep returning to the issue of data reliance. To address it, we need to recognize that it is not just a question of data ownership, nor of assuming that some other party "might be reliable". Blockchain technology will provide an efficient and widely applicable solution to this problem and will ultimately produce a universal standard for data reliance.
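
To make the bridge scenario concrete, here is a minimal sketch of the tamper-evidence idea that blockchains build on: each sensor reading is hashed together with the hash of the previous reading, so changing an old "good" into "bad" breaks the chain. This is an illustration under stated assumptions, not the author's proposed system. The record layout and the names `record_hash`, `append_reading`, and `first_invalid_index` are hypothetical, and a real blockchain adds peer-to-peer replication and a consensus protocol on top of this.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first record (assumption)


def record_hash(reading: str, timestamp: int, prev_hash: str) -> str:
    """Hash a reading together with the hash of the record before it."""
    payload = json.dumps(
        {"reading": reading, "timestamp": timestamp, "prev_hash": prev_hash},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_reading(chain: list, reading: str, timestamp: int) -> None:
    """Append a sensor reading, linking it to the current head of the chain."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({
        "reading": reading,
        "timestamp": timestamp,
        "prev_hash": prev_hash,
        "hash": record_hash(reading, timestamp, prev_hash),
    })


def first_invalid_index(chain: list) -> int:
    """Return the index of the first record that fails verification, or -1."""
    prev_hash = GENESIS
    for i, rec in enumerate(chain):
        expected = record_hash(rec["reading"], rec["timestamp"], prev_hash)
        if rec["prev_hash"] != prev_hash or rec["hash"] != expected:
            return i
        prev_hash = rec["hash"]
    return -1


# The sensor (Company A) reports "good" three times.
log = []
for t, reading in enumerate(["good", "good", "good"]):
    append_reading(log, reading, t)

untampered_result = first_invalid_index(log)

# Company B keeps its own copy of the latest hash it has seen.
head_seen_by_b = log[-1]["hash"]

# Company A later edits history to claim it sent an alert.
log[1]["reading"] = "bad"
tampered_at = first_invalid_index(log)

# Even if Company A recomputes every hash, the new head no longer
# matches the one Company B holds.
forged = []
for rec in log:
    append_reading(forged, rec["reading"], rec["timestamp"])
forged_is_self_consistent = first_invalid_index(forged) == -1
forged_matches_b = forged[-1]["hash"] == head_seen_by_b
```

The design point is that neither party has to own the whole pipeline or trust the other blindly: as long as Company B (or any other peer) holds an independent copy of the chain head, a rewritten history is detectable.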