Shannon entropy … too much Data, bad bad bad

Too many words (too much data) for a single piece of information? How Shannon’s entropy imposes fundamental limits on communication

Working with data, we asked ourselves something fundamental: how much data does it take to convey complete information?

We ask because the intuitive relationship does not actually hold: more data does not automatically mean more information. In our daily work at A.I.LoveTourism we also deal with another, different issue, the overfitting of datasets … but that is another topic.

But back to the topic: is there a theory that can tell us, a priori, how much information is enough to construct whole concepts?

Yes: the concept of Shannon entropy.

What is a message, really? Claude Shannon recognized that the key ingredient is surprise.

If someone tells you a fact you already know, they have essentially told you nothing. If, on the other hand, they tell you a secret, then they really have communicated something to you.

This distinction is at the heart of Claude Shannon’s theory of information. Introduced in a landmark 1948 paper, “A Mathematical Theory of Communication,” it provides a rigorous mathematical framework for quantifying the amount of information required to accurately send and receive a message, determined by the degree of uncertainty about what the intended message might say.

Let’s take an example.

In a hypothetical scenario, I have a rigged coin: it is heads on both sides. I flip it twice. How much information is needed to communicate the result?

None, because before you even receive the message you already know with absolute certainty that both flips will come up heads.

In the second scenario, I make the two flips with a regular coin: heads on one side, tails on the other. We can communicate the result in binary: 0 for heads, 1 for tails. There are four possible messages – 00, 01, 10, 11 – and each requires two bits of information.

So what is the point? In the first scenario, the content of the message is known with absolute certainty, and zero bits are required to transmit it. In the second case, each of the four possible messages is equally likely, so you have only a one-in-four (25%) chance of guessing the right one in advance, and two bits of information are required to resolve the ambiguity. More generally, the less is known about what the message will say, the more information is needed to convey it.
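As a quick sanity check, here is a minimal Python sketch of that counting argument (the variable names are ours, purely for illustration): with N equally likely messages, log2(N) bits are needed to single one out, so the rigged coin needs log2(1) = 0 bits and the two fair flips need log2(4) = 2 bits.

```python
from math import log2

# Number of equally likely messages in each scenario.
rigged_coin_messages = 1   # "heads, heads" is the only possibility
fair_coin_messages = 4     # 00, 01, 10, 11

# With N equally likely messages, log2(N) bits single one of them out.
print(log2(rigged_coin_messages))  # 0.0 bits: the receiver learns nothing new
print(log2(fair_coin_messages))    # 2.0 bits: one bit per fair flip
```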


Entropy in communication

Shannon, you can’t fool him

Shannon was the first to make this relationship mathematically precise. He captured it in a formula that calculates the minimum number of bits (a threshold later called Shannon entropy) necessary to communicate a message. He also showed that if a sender uses fewer than this minimum number of bits, the message will inevitably be distorted.
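For reference, that formula can be written as follows: for a message drawn from possible outcomes x with probabilities p(x), the entropy in bits is

H(X) = -\sum_{x} p(x)\,\log_2 p(x)

When all outcomes are equally likely, this reduces to the logarithm of the number of possible messages, which is exactly the count used in the coin example above.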

“He had the great insight that information is greatest when you are most surprised to know something,” said Tara Javidi, information theorist at the University of California, San Diego.

The term “entropy” is borrowed from physics, where entropy is a measure of disorder. A cloud has higher entropy than an ice cube, because a cloud allows water molecules to be arranged in many more ways than the crystalline structure of the cube does. In the same way, a random message has high Shannon entropy (there are many possible arrangements of the information), while one that obeys a rigid pattern has low entropy. There are also formal similarities in the way entropy is calculated in physics and in information theory: in physics, the formula involves the logarithm of the number of possible physical states; in information theory, the logarithm of the number of possible outcomes of an event.
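To make the parallel concrete (the symbols here are the standard textbook ones, not anything from the original article): Boltzmann entropy in physics and Shannon entropy for M equally likely outcomes read, respectively,

S = k_B \ln \Omega \qquad\text{versus}\qquad H = \log_2 M

where \Omega is the number of accessible physical states and M the number of possible messages; when the outcomes are not equally likely, the probability-weighted formula above applies.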

Shannon’s logarithmic formula for entropy belies the simplicity of what it captures, because another way to think of Shannon entropy is as the number of yes-or-no questions needed, on average, to ascertain the content of a message.

For example, imagine two weather stations, one in San Diego and the other in St. Louis. Each wants to send the other the seven-day forecast for its city. In San Diego it is almost always sunny, so there is high confidence in what the forecast will say. The weather in St. Louis is more uncertain: the probability of a sunny day is closer to 50 percent. Because the St. Louis forecast is so much harder to predict in advance, it takes more yes-or-no questions, and therefore more bits, per day to pin it down than the San Diego one.
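A back-of-the-envelope version in Python (the 90% sunny figure for San Diego is our own assumption for illustration, not a statistic from the example): a 50/50 St. Louis day carries a full bit of uncertainty, while a mostly-sunny San Diego day carries much less, so the seven-day San Diego forecast needs fewer bits on average.

```python
from math import log2

def daily_forecast_entropy(p_sunny):
    """Entropy, in bits, of one day's sunny-or-not forecast."""
    if p_sunny in (0.0, 1.0):
        return 0.0  # no uncertainty, nothing to communicate
    p_other = 1.0 - p_sunny
    return -(p_sunny * log2(p_sunny) + p_other * log2(p_other))

san_diego = daily_forecast_entropy(0.9)  # assumed probability, ~0.47 bits/day
st_louis = daily_forecast_entropy(0.5)   # exactly 1 bit/day

print(f"San Diego: {7 * san_diego:.2f} bits for the 7-day forecast")
print(f"St. Louis: {7 * st_louis:.2f} bits for the 7-day forecast")
```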

Information and entropy

• In general, by observing a data stream one can attempt to reconstruct the original content. A related problem is how much information can be extracted from (or is contained in) a data stream.

• The concept of Informational Entropy arises precisely from the need to quantify the information content of abstract sequences of signals.

• For the concept to serve our needs well, it must assign zero information content to a sequence whose content is already known in advance, and maximal content to a completely unpredictable one; the sketch after this list estimates this quantity for concrete sequences of symbols.
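A minimal sketch of that quantification, assuming we estimate each symbol’s probability from its observed frequency in the stream (an order-0 estimate that ignores any correlations between symbols; the function name is ours):

```python
from collections import Counter
from math import log2

def empirical_entropy(sequence):
    """Order-0 entropy estimate, in bits per symbol, of a sequence of signals."""
    counts = Counter(sequence)
    total = len(sequence)
    return -sum((n / total) * log2(n / total) for n in counts.values())

print(empirical_entropy("AAAAAAAA"))  # 0.0: fully predictable, no information
print(empirical_entropy("ABABABAB"))  # 1.0 bit/symbol, judging by frequencies alone
print(empirical_entropy("ABCDABCD"))  # 2.0 bits/symbol
```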

Data compression

Use in everyday life

The entropy rate of a data source is the average number of bits per symbol required to encode it. Shannon’s experiments with human predictors suggest an information rate of between 0.6 and 1.3 bits per character for English; the PPM compression algorithm can achieve about 1.5 bits per character on English text.
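A rough everyday illustration in Python, using the standard-library zlib compressor (the sample text and the repetition factor are ours): repetition is redundancy, and redundancy is exactly what a compressor exploits, so a deliberately repetitive sample drops far below the 8 bits per character of plain ASCII. Ordinary, non-repetitive English prose compressed with the best algorithms approaches the PPM figure quoted above, while general-purpose tools like zlib typically land somewhat higher.

```python
import zlib

# A deliberately repetitive sample: easy prey for any compressor.
text = ("Shannon entropy puts a floor on how few bits per character "
        "any lossless code can spend on English text. ") * 50

raw = text.encode("ascii")
compressed = zlib.compress(raw, 9)  # level 9 = strongest standard setting

print(f"{8 * len(compressed) / len(raw):.2f} bits per character "
      f"(plain ASCII uses 8)")
```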

Shannon’s definition of entropy, when applied to an information source, can determine the minimum channel capacity required to reliably transmit the source as encoded binary digits. Shannon entropy measures the information contained in a message, as opposed to the portion of the message that is determined (or predictable). Examples of the latter include redundancy in the structure of a language and statistical properties relating to the frequencies of occurrence of letter pairs, triplets, word pairs, and so on.

The minimum channel capacity can be realized in theory using the typical set, or in practice using Huffman, Lempel-Ziv, or arithmetic coding (see also Kolmogorov complexity). In practice, compression algorithms deliberately include judicious redundancy in the form of checksums to protect against errors.
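As a sketch of one of those techniques, here is a minimal Huffman coder in Python (our own illustrative code: no decoder, no transmitted code table, no checksums). It gives short bit strings to frequent symbols and longer ones to rare symbols, which is how the average code length approaches the entropy of the symbol frequencies.

```python
import heapq
from collections import Counter

def huffman_code(text):
    """Build a Huffman code (symbol -> bit string) from symbol frequencies."""
    freq = Counter(text)
    if len(freq) == 1:                    # degenerate case: a single distinct symbol
        return {next(iter(freq)): "0"}
    # Heap entries: (total frequency, tie-breaker, {symbol: code so far}).
    heap = [(f, i, {sym: ""}) for i, (sym, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)   # the two least frequent subtrees...
        f2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}          # ...are merged, and
        merged.update({s: "1" + c for s, c in right.items()})   # their codes grow by one bit
        heapq.heappush(heap, (f1 + f2, tie, merged))
        tie += 1
    return heap[0][2]

message = "a rose is a rose is a rose"
codes = huffman_code(message)
total_bits = sum(len(codes[ch]) for ch in message)
print(f"{total_bits} bits in total, {total_bits / len(message):.2f} bits per symbol")
```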

A 2011 study in Science estimated the world’s technological capacity to store and communicate optimally compressed information, normalized to the most effective compression algorithms available in 2007, thereby estimating the entropy of the technologically available sources.
