Types of information encoding used in computers

The same information can be presented in several forms. Basic coding methods allow this to be done in the modern world. After the advent of computer technology, it became necessary to encode any type of information that a person works with. But solving problems of this type began long before the advent of computers.


Method 1. Binary coding.

Binary coding is one of the most widespread ways of representing information. Computers, robots, and numerically controlled machine tools most often encode information as words in the binary alphabet.


Method 2. Shorthand.

Shorthand encodes text using special characters and is the fastest way to record spoken language. Only specially trained people, called stenographers, have this skill; they can write down text in step with the speaker.

Method 3. Synchronization.

When working with digital information, synchronization is especially important. While reading or writing, the moment of each signal transition must be determined accurately. Without synchronization, the transition period may be determined incorrectly, which leads to data loss or corruption.

Method 4. Run Length Limited (RLL).

Run-length-limited coding has been one of the most popular methods for recording data on disks. It allows about one and a half times more data to be placed on a disk than MFM recording does. The method encodes whole groups of bits rather than individual bits.


Method 5. Conversion tables.

A conversion (lookup) table contains a list of encoded characters arranged in a defined order. Using the table, a character is converted to its binary code and back.
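As a minimal sketch of the idea, the snippet below uses a made-up 3-bit table for a five-character alphabet. The table contents and the names ENCODE_TABLE, DECODE_TABLE, encode, and decode are illustrative assumptions, not part of any real standard.

```python
# Hypothetical 3-bit conversion table for a tiny alphabet (not a real standard).
ENCODE_TABLE = {"A": "000", "B": "001", "C": "010", "D": "011", " ": "100"}
# The reverse table is built by swapping keys and values.
DECODE_TABLE = {code: char for char, code in ENCODE_TABLE.items()}


def encode(text: str) -> str:
    """Replace every character with its fixed-width binary code."""
    return "".join(ENCODE_TABLE[ch] for ch in text)


def decode(bits: str, width: int = 3) -> str:
    """Split the bit string into fixed-width groups and look each one up."""
    groups = (bits[i:i + width] for i in range(0, len(bits), width))
    return "".join(DECODE_TABLE[group] for group in groups)


encoded = encode("BAD CAB")
print(encoded)
print(decode(encoded))
```

Because every code has the same width, the decoder can cut the bit string into equal groups without any separators.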

Method 6. Matrix method.

The matrix principle of encoding graphic images is that the picture is divided into a given number of rows and columns. Each cell of the resulting grid is then encoded according to a chosen rule.
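The following sketch illustrates the matrix principle on a small hand-drawn grid. The 5x5 picture and the rule "filled cell = 1, empty cell = 0" are assumptions chosen for the example; a real format could use any other rule.

```python
# A hypothetical 5x5 black-and-white picture: '#' is a filled cell, '.' is empty.
PICTURE = [
    "..#..",
    ".#.#.",
    "#####",
    "#...#",
    "#...#",
]


def encode_picture(rows):
    """Encode each grid cell as one bit: 1 for filled, 0 for empty."""
    return ["".join("1" if cell == "#" else "0" for cell in row) for row in rows]


def decode_picture(bit_rows):
    """Restore the picture from the rows of bits."""
    return ["".join("#" if bit == "1" else "." for bit in row) for row in bit_rows]


bit_rows = encode_picture(PICTURE)
for row in bit_rows:
    print(row)
```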


Information comes in different kinds, for example:

Smell, taste, sound;

Symbols and signs.

In various branches of science, culture and technology, special forms have been developed for recording information.

Code is a group of symbols that can be used to display information.

The process of converting a message into a combination of characters according to a code is called coding.

There are three main methods of encoding information:

Numerical: using numbers.
Symbolic: information is encoded using characters of the same alphabet as the source text.
Graphical: information is encoded using pictures or icons.

Information encoding examples:

Letters are used to record the sounds of the Russian language (АБВГДЕЁЖ…ЭЮЯ);

Digits are used to record numbers (0123456789);

Sounds are recorded with notes and other symbols;

The blind use Braille, in which each letter consists of six elements, each either a raised dot or a flat spot.

Braille alphabet
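Since each of the six elements has two states, a Braille cell behaves like a 6-bit code. The sketch below is an illustration under two assumptions that are not from the text above: it uses the Unicode "Braille Patterns" block, which starts at U+2800 and stores dot n in bit n-1, and the English Braille assignments for the letters "a" and "b". The name braille_cell is hypothetical.

```python
BRAILLE_BASE = 0x2800  # first code point of the Unicode "Braille Patterns" block


def braille_cell(dots):
    """Build a Braille character from the numbers of the raised dots (1 to 6)."""
    mask = 0
    for dot in dots:
        if not 1 <= dot <= 6:
            raise ValueError("dot numbers must be between 1 and 6")
        mask |= 1 << (dot - 1)  # dot n is stored in bit n-1
    return chr(BRAILLE_BASE + mask)


print(2 ** 6)                # number of distinct six-dot cells
print(braille_cell([1]))     # dot 1 only: letter "a" in English Braille
print(braille_cell([1, 2]))  # dots 1 and 2: letter "b" in English Braille
```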

Keep in mind that without knowing how the information was encoded, the same code can be understood in different ways. For example, 300522005 could be read as a plain number, a phone number, or a population count.

A computer encodes the information it receives: text, images, and sounds. It processes, stores, and sends information in coded form. To present information from a computer in a human-readable form, it must be decoded.

Encryption methods are studied by a special science, cryptography.

A computer uses only two characters, 0 and 1, to encode any information, because two states are the easiest to implement in hardware:

0 - no signal (no voltage or no current flows);

1 - there is a signal (there is voltage or current flows).

Code creation.

One bit can encode two states: 0 and 1 (yes and no, black and white). If you increase the number of bits by one, you will get twice as many codes.

Example:

Two bits create 4 different codes: 00, 01, 10 and 11;

three bits create 8 different codes: 000, 001, 010, 011, 100, 101, 110, and 111.
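The doubling rule can be checked by listing the codes directly. This is a small sketch; the function name all_codes is an assumption made for the example.

```python
from itertools import product


def all_codes(bits: int):
    """List every binary code of the given length, in counting order."""
    return ["".join(digits) for digits in product("01", repeat=bits)]


for n in (1, 2, 3):
    codes = all_codes(n)
    print(n, len(codes), codes)
```

Each extra bit doubles the list because every existing code can be extended with either 0 or 1.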

Coding different kinds of information

Text encoding

When encoding text, each character is assigned a value, such as a serial number.

The first popular standard for encoding text on computers is ASCII (American Standard Code for Information Interchange), which uses 7 bits to encode each character.

7 bits can encode 128 characters: uppercase and lowercase Latin letters, digits, punctuation marks, and special characters such as "@".

Different extensions of the standard were created that widened the code to 8 bits (256 characters) so that national characters, for example the Latvian letter ā, could be encoded.

But 256 characters were not enough to encode the characters of all alphabets, so new standards were created. One of the most popular today is Unicode. In its original form each character is encoded with 2 bytes, which gives 65,536 different codes.

Image data encoding

Almost all created and processed images stored in a computer can be divided into two groups:

Raster graphics;

Vector graphics.

Any raster image consists of colored dots. These dots are called pixels.

Grayscale images usually use 256 shades of gray, ranging from white to black. Encoding one pixel therefore takes 8 bits (1 byte).

For coding color images three colors are commonly used: red, green and blue. A color tone is obtained by mixing these three colors.
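One common way to store such a color is to give each of the three components one byte, so a pixel takes 24 bits. The 8-bits-per-channel layout and the names pack_rgb and unpack_rgb are assumptions for this sketch; the text above does not fix a particular bit depth.

```python
def pack_rgb(red: int, green: int, blue: int) -> int:
    """Pack three 8-bit channel values into one 24-bit number."""
    for value in (red, green, blue):
        if not 0 <= value <= 255:
            raise ValueError("each channel must be in the range 0..255")
    return (red << 16) | (green << 8) | blue


def unpack_rgb(color: int):
    """Split a 24-bit number back into its red, green and blue parts."""
    return (color >> 16) & 0xFF, (color >> 8) & 0xFF, color & 0xFF


orange = pack_rgb(255, 165, 0)
print(format(orange, "024b"))
print(unpack_rgb(orange))
```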

Sound encoding

Sound arises from vibrations of the air. A sound has two main characteristics:

- amplitude of the oscillation, which determines the volume of the sound;

- frequency of the oscillation, which determines the pitch of the sound.

Sound can be converted into an electrical signal, for example by a microphone.

Audio is encoded by measuring the level of the signal at regular time intervals and assigning a binary value to each measurement. The more frequently these measurements are taken, the better the sound quality.

Example:

One CD, with a capacity of 700 MB, can hold 80 minutes of CD quality audio.

Video encoding

The film consists of frames that change rapidly. An encoded movie contains information about the frame size, colors used, and the number of frames per second (usually 30), as well as the method of sound recording - each frame separately or the entire movie at once.

Information is constantly being exchanged in the world. Its sources can be people, technical devices, various things, and objects of living and inanimate nature. One or more objects can receive the information.

For better data exchange, information is encoded and processed on the transmitter side (the data is prepared and converted into a form convenient for transmission, processing, and storage), then forwarded and decoded on the receiver side (the encoded data is converted back to its original form). These tasks are interrelated: the source and the receiver must use matching algorithms, otherwise encoding and decoding will be impossible. Coding and processing of graphic and multimedia information is usually done with computer technology.

Encoding information on a computer

There are many ways to process data (text, numbers, graphics, video, sound) with a computer. All information processed by a computer is represented in binary code, using the digits 1 and 0, called bits. Technically this is very simple to implement: 1 means the electrical signal is present, 0 means it is absent. For a human such codes are inconvenient: long strings of zeros and ones are very hard to decipher at a glance. But this format shows clearly what information encoding is. For example, the number 8 written as an eight-digit binary number is the bit sequence 00001000. What is difficult for a person is simple for a computer: it is easier for electronics to process many simple elements than a small number of complex ones.
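A short sketch of writing a number as a fixed-width bit sequence and reading it back. The helper names to_binary and from_binary are assumptions made for the example.

```python
def to_binary(number: int, width: int = 8) -> str:
    """Write a non-negative integer as a fixed-width string of bits."""
    if not 0 <= number < 2 ** width:
        raise ValueError("number does not fit in the requested width")
    return format(number, f"0{width}b")


def from_binary(bits: str) -> int:
    """Read a string of bits back as an integer."""
    return int(bits, 2)


print(to_binary(8))
print(from_binary(to_binary(8)))
```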

Text encoding

When we press a key on the keyboard, the computer receives a specific code for that key, looks it up in the standard ASCII table (American Standard Code for Information Interchange), “understands” which key was pressed, and passes the code on for further processing (for example, to display the character on the monitor). A character code is stored in 8 bits, so the maximum number of combinations is 256. The first 128 codes are used for control characters, digits, and Latin letters. The second half is intended for national characters and pseudographics.


An example makes it easier to understand what information encoding is. Consider the codes of the Latin letter “C” and the Russian letter “С”. Note that both are uppercase, and their codes differ from those of the lowercase letters. The Latin letter is 01000011, while the Russian one (in CP1251) is 11010001. What looks the same to a person on the screen is completely different to the computer. Note also that the first 128 codes (0 to 127) are the same everywhere, while from code 128 onward one binary code can correspond to different characters depending on the code table used. For example, decimal code 194 corresponds to the letter “б” in KOI8, “В” in CP1251, and “Т” in ISO, while in CP866 and Mac it does not correspond to any letter at all. So when we open a text and see gibberish instead of Russian words, the wrong encoding has been applied and we need to choose a different one.
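The dependence on the code table can be tried out with Python's built-in codecs. This is a sketch under stated assumptions: the codec names koi8_r, cp1251, iso8859_5, cp866, and mac_cyrillic are Python's names, chosen here to stand for the KOI8, CP1251, ISO, CP866, and Mac tables mentioned above, and decode_everywhere is a hypothetical helper.

```python
# Python codec names assumed to correspond to the tables named in the text.
CODE_PAGES = ["koi8_r", "cp1251", "iso8859_5", "cp866", "mac_cyrillic"]


def decode_everywhere(code: int):
    """Decode one byte value with each code page and return the results."""
    raw = bytes([code])
    return {name: raw.decode(name) for name in CODE_PAGES}


results = decode_everywhere(194)
for name, char in results.items():
    print(f"{name:12} {char!r}")

# Letters that look alike on screen have different codes.
latin_c = "C".encode("ascii")[0]       # Latin capital C
cyrillic_es = "С".encode("cp1251")[0]  # Cyrillic capital Es
print(format(latin_c, "08b"), format(cyrillic_es, "08b"))
```

Decoding a byte with the wrong table never raises an error here; it silently produces a different character, which is why mis-encoded text shows up as gibberish rather than as a failure.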

Number encoding

The binary system uses only two values, 0 and 1. Operations on binary numbers are the subject of binary arithmetic, which has its own rules. Take, for example, the number 45 typed on the keyboard. Each digit has its own eight-bit code in the ASCII table, so the typed number occupies two bytes (16 bits): 4 is 00110100 and 5 is 00110101. To use this number in calculations, special algorithms convert it into the binary system as an eight-bit binary number: 45 is 00101101.
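The difference between a number as typed text and a number as a value can be seen in a few lines. This is only an illustration; the variable names are assumptions.

```python
typed = "45"

# As typed text: one ASCII code per digit character, two bytes in total.
digit_codes = [format(ord(char), "08b") for char in typed]
for char, code in zip(typed, digit_codes):
    print(char, code)

# As a number: a single value that fits in one byte.
value = int(typed)
value_bits = format(value, "08b")
print(value, value_bits)
```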

In the 1950s, computers that were most often used for scientific and military purposes were the first to implement graphical display of data. Today, the visualization of information received from a computer is a common and familiar phenomenon for any person, and in those days it made an extraordinary revolution in working with technology. Perhaps the influence of the human psyche had an effect: visually presented information is better absorbed and perceived. A big breakthrough in the development of data visualization occurred in the 80s, when the coding and processing of graphic information received a powerful development.

Analog and discrete graphics representation

Audio encoding

Encoding multimedia information consists of converting the analog nature of sound into a discrete one for more convenient processing. An analog-to-digital converter (ADC) receives an analog signal at its input, measures its amplitude at fixed time intervals, and outputs a digital sequence describing how the amplitude changes. No physical transformation takes place.

The output signal is discrete, so the more often the amplitude is measured (sampled), the more accurately the output matches the input signal and the better the encoding. The ordered sequence of digital data produced by the ADC is also commonly called a sample. The process itself is called sampling, or discretization.

The reverse conversion occurs with the help of a DAC: based on the digital data entering the input, at certain points in time, an electrical signal of the required amplitude is generated.

Sampling parameters

The main sampling parameters are the sampling rate and the bit depth, that is, the precision with which the amplitude is measured for each sample. The more precisely the signal amplitude is captured at each moment during digitization, the higher the quality of the signal after the ADC and the more faithfully the wave is restored by the inverse conversion.
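The effect of bit depth can be illustrated with a software stand-in for an ADC. The sketch below samples a pure sine tone and rounds each sample to one of 2**bit_depth levels; the 440 Hz tone, the 8000 Hz sampling rate, and the function names are assumptions chosen for the example, not values from the text.

```python
import math


def sample_and_quantize(freq_hz, sample_rate, bit_depth, duration_s):
    """Sample a sine wave and round each sample to one of 2**bit_depth levels."""
    top_level = 2 ** bit_depth - 1
    count = round(sample_rate * duration_s)
    samples = []
    for n in range(count):
        amplitude = math.sin(2 * math.pi * freq_hz * n / sample_rate)  # -1.0 .. 1.0
        samples.append(round((amplitude + 1) / 2 * top_level))         # 0 .. top_level
    return samples


def max_error(freq_hz, sample_rate, bit_depth, duration_s=0.01):
    """Largest gap between the original wave and the restored samples."""
    top_level = 2 ** bit_depth - 1
    samples = sample_and_quantize(freq_hz, sample_rate, bit_depth, duration_s)
    worst = 0.0
    for n, level in enumerate(samples):
        original = math.sin(2 * math.pi * freq_hz * n / sample_rate)
        restored = level / top_level * 2 - 1  # what a DAC would output
        worst = max(worst, abs(original - restored))
    return worst


for depth in (4, 8, 16):
    print(depth, max_error(440, 8000, depth))
```

The printed error shrinks as the bit depth grows, which is the sense in which a higher bit depth restores the wave more faithfully.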

Encoding textual information in a computer is sometimes an essential condition for the correct operation of the device or the display of a particular fragment. How this process occurs during the work of a computer with text and visual information, sound - we will analyze all this in this article.

Introduction

An electronic computing machine (what we call a computer in everyday life) perceives text in a very specific way. The encoding of textual information is very important to it, because it treats every text fragment as a group of separate characters.

What are the symbols?

For a computer, symbols include not only Russian, English, and other letters, but also punctuation marks and other signs. Even the space we use to separate words when typing is treated as a character. This is somewhat reminiscent of mathematics, where zero is a number and at the same time stands for nothing. A space in a text could even become a topic for philosophers. That is a joke, of course, but as they say, there is some truth in every joke.

What kinds of information are there?

To perceive information, a computer has to start processing it. And what kinds of information are there? The topic of this article is the encoding of textual information. We will pay special attention to it, but we will also touch on related topics.

Information can be textual, numerical, audio, or graphic. To display what we type on the keyboard, for example, the computer must run processes that encode the text. We see symbols and letters, which is understandable. But what does the machine see? It perceives absolutely all information, not only text, as a sequence of zeros and ones. They form the basis of the so-called binary code. Accordingly, the process that converts information into a form the device understands is called “binary coding of text information”.

Brief principle of the binary code

Why has binary coding become the most widespread in electronic machines? Any sequence of characters can be encoded using zeros and ones. But this is not the only advantage of binary coding. The principle behind it is very simple and at the same time quite practical. When an electrical impulse is present, it is marked (conventionally, of course) as one. No impulse is marked as zero. In other words, text encoding is based on building a sequence of electrical impulses. A logical sequence composed of binary characters is called machine language. Binary coding also makes it possible to process textual information in a fairly short time.

Bits and bytes

Each binary digit perceived by the machine carries a certain amount of information, equal to one bit. This applies to every one and every zero in any sequence of encoded information.

Accordingly, the amount of information in any case can be determined simply by knowing the number of characters in the binary code sequence. They will be numerically equal to each other. 2 digits in the code carry information of 2 bits, 10 digits - 10 bits, and so on. The principle of determining the information volume, which lies in a particular fragment of binary code, is quite simple, as you can see.

Encoding text information in a computer

Right now you are reading an article that consists, as we see it, of a sequence of letters. The computer, as mentioned earlier, perceives all information (this article included) not as a sequence of letters but as zeros and ones denoting the absence and presence of an electrical impulse.

One character that we see on the screen can be encoded using a unit called a byte. As described above, binary code carries a certain amount of information, numerically equal to the total number of zeros and ones in the chosen fragment. 8 bits make up 1 byte. The combinations of signals within a byte can vary widely, as you can easily see by drawing a rectangle of 8 equal cells on paper and filling them in different ways.

It turns out that it is possible to encode textual information using an alphabet that has a capacity of 256 characters. What is the point? The meaning lies in the fact that each character will have its own binary code. Combinations “attached” to certain characters start from 00000000 and end with 11111111. If you switch from binary to decimal number system, then you can encode information in such a system from 0 to 255.
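As a small illustration of "one character, one byte", the sketch below writes a short text as 8-bit codes and restores it. It assumes the ASCII table for the character codes; the names text_to_bits and bits_to_text are hypothetical.

```python
def text_to_bits(text: str) -> str:
    """Show each character as its 8-bit code, separated by spaces."""
    return " ".join(format(byte, "08b") for byte in text.encode("ascii"))


def bits_to_text(bits: str) -> str:
    """Turn space-separated 8-bit codes back into text."""
    return bytes(int(group, 2) for group in bits.split()).decode("ascii")


encoded = text_to_bits("Hi there")
print(encoded)
print(bits_to_text(encoded))
```

The space between the two words gets its own code (00100000), just like any letter.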

Do not forget that now there are various tables that use the encoding of the letters of the Russian alphabet. These are, for example, ISO and KOI-8, Mac and CP in two variations: 1251 and 866. It is easy to make sure that the text encoded in one of these tables will not be displayed correctly in a different encoding. This happens due to the fact that in different tables different symbols correspond to the same binary code.

This was a problem at first. Today, however, programs have built-in algorithms that convert text to the correct form. The 1990s brought an encoding called Unicode. In it, each character has 2 bytes at its disposal, which makes it possible to encode text with a much larger set of characters. 256 versus 65,536: quite a difference.

Graphics encoding

Encoding textual and graphical information has some similarities. Graphic information is displayed on a peripheral device called a monitor. Computer graphics is now widely used in many fields, and fortunately the hardware of personal computers is capable of solving fairly complex graphics problems.

In recent years it has become possible to process video as well. Text, however, is much “lighter” than graphics, so graphics files end up considerably larger. Such problems can be overcome by understanding how graphic information is represented.

Let's first understand what groups this type of information is divided into. First, it's raster. Secondly, vector.

Raster images resemble checkered paper in which each cell is painted in one color or another, somewhat like a mosaic. In raster graphics the image is divided into separate elementary parts called pixels, that is, “dots”. The pixels are arranged in rows, and the grid they form is called a raster. With these two definitions in mind, we can say that a bitmap is nothing more than a set of pixels laid out on a rectangular grid.

The monitor's raster and pixel size affect image quality: the larger the raster, the higher the quality. The raster size is the screen resolution that every user has probably heard of. One of the most important characteristics of a computer screen, however, is its resolving power, not just its resolution. It shows how many pixels fit in a unit of length and is usually measured in pixels per inch. The more pixels per unit length, the higher the quality, since the “graininess” is reduced.
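Pixels per inch can be estimated from the raster size and the physical diagonal of the screen. The sketch below is an illustration that assumes square pixels; the function name pixels_per_inch and the monitor sizes are made up for the example.

```python
import math


def pixels_per_inch(width_px: int, height_px: int, diagonal_inches: float) -> float:
    """Pixel density: length of the diagonal in pixels divided by its length in inches."""
    diagonal_px = math.hypot(width_px, height_px)
    return diagonal_px / diagonal_inches


# The same 1920x1080 raster on two hypothetical monitors of different size.
print(pixels_per_inch(1920, 1080, 24))
print(pixels_per_inch(1920, 1080, 27))
```

The same raster stretched over a larger screen gives a lower density, which is why the raster size alone does not determine how grainy the picture looks.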

Audio stream processing

Coding of text and sound information, like other types of coding, has some peculiarities. We will now focus on the last process: encoding audio information.

The presentation of an audio stream (as well as a single sound) can be done in two ways.

Analogue form of sound information presentation

In this case, the value can take on a really huge number of different values. Moreover, these same values do not remain constant: they change very quickly, and this process is continuous.

Discrete Form of Sound Information Representation

With the discrete method, the value can take only a limited number of values, and it changes in jumps. Not only sound but also graphic information can be encoded discretely. The same is true of the analog form.

Analog audio is stored, for example, on vinyl records. A CD, by contrast, stores sound in discrete form.

At the very beginning we said that a computer perceives all information in machine language. To do this, information is encoded as a sequence of electrical impulses, zeros and ones. Audio is no exception. To process sound on a computer, it must first be turned into such a sequence. Only then can operations be performed on a stream or a single sound.

During encoding, the stream undergoes temporal sampling. The continuous sound wave is divided into small time intervals, and an amplitude value is set for each interval separately.

Conclusion

So, what have we found out in this article? First, all information displayed on a computer monitor is encoded before it appears there. Second, this coding consists of translating information into machine language. Third, machine language is nothing more than a sequence of electrical impulses, zeros and ones. Fourth, there are separate tables for encoding different characters. And fifth, graphic and sound information can be represented in analog or discrete form. These are the main points we have covered. The discipline that studies this area is computer science. The basics of encoding textual information are taught at school, since there is nothing complicated about them.

A modern computer can process numerical, textual, graphic, sound and video information. All these types of information in a computer are presented in binary code, that is, an alphabet with a capacity of two characters (0 and 1) is used. This is due to the fact that it is convenient to represent information in the form of a sequence of electrical impulses: there is no impulse (0), there is an impulse (1). Such coding is usually called binary, and the logical sequences of zeros and ones themselves are called machine language.

Each digit of machine binary code carries the amount of information equal to one bit.

This conclusion can be drawn by considering the numbers of the machine alphabet as equally probable events. When writing a binary digit, it is possible to implement the choice of only one of the two possible states, which means that it carries an amount of information equal to 1 bit. Therefore, two digits carry information of 2 bits, four digits - 4 bits, etc. To determine the amount of information in bits, it is enough to determine the number of digits in a binary machine code.

Encoding of text information

Currently, most users use a computer to process textual information, which consists of characters: letters, numbers, punctuation marks, etc.

One cell with an information capacity of 1 bit can encode only 2 different states. For each character that can be entered from the keyboard in the Latin layout to get its own unique binary code, 7 bits are required. According to the Hartley formula, a sequence of 7 bits gives N = 2^7 = 128 different combinations of zeros and ones, that is, binary codes. By assigning each character its binary code, we get an encoding table. A person works with symbols, a computer with their binary codes.
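The Hartley relation can be computed in both directions: from a number of bits to the number of combinations, and from an alphabet size to the number of bits it needs. The function names below are assumptions made for this sketch.

```python
def combinations(bits: int) -> int:
    """Number of different codes that fit in the given number of bits (N = 2^i)."""
    return 2 ** bits


def bits_needed(symbols: int) -> int:
    """Smallest whole number of bits that gives each symbol its own code."""
    if symbols < 1:
        raise ValueError("the alphabet must contain at least one symbol")
    # (symbols - 1).bit_length() is the integer form of ceil(log2(symbols)).
    return max(1, (symbols - 1).bit_length())


print(combinations(7))
print(bits_needed(128))
print(bits_needed(256))
```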

For the Latin keyboard layout there is a single encoding table used throughout the world, so text typed in the Latin layout is displayed correctly on any computer. This table is called ASCII (American Standard Code for Information Interchange), usually pronounced “askee”. Below is the entire ASCII table, with the codes given in decimal form. From it you can determine that when you enter, say, the symbol “*” from the keyboard, the computer perceives it as the code 42(10), and 42(10) = 101010(2) is the binary code of the symbol “*”. Codes 0 to 31 are control codes and have no printable symbols.

ASCII character table
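The lookup described above can be reproduced with Python's built-in ord and chr functions. This is a small sketch; the helper name describe is an assumption.

```python
def describe(symbol: str):
    """Return the decimal and binary code of a single character."""
    code = ord(symbol)
    return code, format(code, "b")


decimal_code, binary_code = describe("*")
print(decimal_code, binary_code)

# Going the other way: from a binary code back to the symbol.
print(chr(int("101010", 2)))
```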

One byte of information is used to encode one character, that is, I = 1 byte = 8 bits. Using the formula that relates the number of possible events K to the amount of information I, you can calculate how many different characters can be encoded (treating characters as possible events):

K = 2^I = 2^8 = 256,

i.e., an alphabet with a capacity of 256 characters can be used to represent textual information.

The essence of encoding is that each character is assigned a binary code from 00000000 to 11111111 or the corresponding decimal code from 0 to 255.

Keep in mind that five different code tables are currently used to encode Russian letters (KOI-8, CP1251, CP866, Mac, ISO), and text encoded with one table will not be displayed correctly in another. This can be shown as a fragment of a combined character encoding table.

Table: different symbols are assigned to the same binary code (columns: binary code, decimal code).

In most cases, however, the user does not have to transcode text documents: this is done by special converter programs built into applications.

Since 1997, the latest versions of Microsoft Office have supported a new encoding called Unicode. Unicode is an encoding table that uses 2 bytes, that is, 16 bits, to encode each character. Such a table can encode N = 2^16 = 65,536 symbols.

Unicode includes almost all modern scripts, including: Arabic, Armenian, Bengali, Burmese, Greek, Georgian, Devanagari, Hebrew, Cyrillic, Coptic, Khmer, Latin, Tamil, Hangul, Han (China, Japan, Korea), Cherokee, Ethiopian, Japanese (katakana, hiragana, kanji) and others.

For academic purposes, many historical scripts have been added, including: ancient Greek, Egyptian hieroglyphs, cuneiform, Mayan script, Etruscan alphabet.

Unicode provides a wide range of mathematical and musical symbols, as well as pictograms.

There are two ranges of codes for Cyrillic characters in Unicode:

Cyrillic (U+0400 to U+04FF)

Cyrillic Supplement (U+0500 to U+052F).
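A character's code point can be checked against these two ranges directly. The sketch below is illustrative; the names CYRILLIC_BLOCKS and cyrillic_block are assumptions.

```python
# The two Cyrillic code ranges listed above, as (first, last) code points.
CYRILLIC_BLOCKS = {
    "Cyrillic": (0x0400, 0x04FF),
    "Cyrillic Supplement": (0x0500, 0x052F),
}


def cyrillic_block(char: str):
    """Return the name of the Cyrillic range a character falls in, or None."""
    code = ord(char)
    for name, (first, last) in CYRILLIC_BLOCKS.items():
        if first <= code <= last:
            return name
    return None


for char in ("Я", "A", "\u0500"):
    print(f"U+{ord(char):04X}", cyrillic_block(char))
```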

But the introduction of the Unicode table in its pure form is constrained by the fact that if the code of one character takes not one byte, but two bytes, it will take twice as much disk space to store the text, and twice as much time to transmit it through communication channels.

Therefore the UTF-8 (Unicode Transformation Format) representation of Unicode is now more common in practice. UTF-8 provides the best compatibility with systems that use 8-bit characters. Text consisting only of characters with codes below 128 becomes plain ASCII text when written in UTF-8. The remaining Unicode characters are represented by sequences of 2 to 4 bytes. Since Latin characters, the most common in the world, still occupy 1 byte in UTF-8, this encoding is generally more economical than pure two-byte Unicode.
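The variable length of UTF-8 is easy to observe by encoding a few characters. This sketch uses Python's built-in "utf-8" codec; the sample characters and the name utf8_length are choices made for the example.

```python
def utf8_length(char: str) -> int:
    """Number of bytes one character occupies in UTF-8."""
    return len(char.encode("utf-8"))


samples = [
    "A",           # Latin letter, code below 128
    "\u042f",      # Cyrillic capital Ya
    "\u20ac",      # euro sign
    "\U0001F600",  # an emoji outside the 16-bit range
]

for char in samples:
    print(f"U+{ord(char):04X}", utf8_length(char), char.encode("utf-8").hex())

# Text made only of codes below 128 is byte-for-byte identical to ASCII.
print("Hello".encode("utf-8") == "Hello".encode("ascii"))
```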

To determine the numeric code of a character, you can use a code table. To do this, select "Insert" - "Symbol" in the menu, after which the Symbol dialog box appears on the screen. It shows the character table for the selected font. The characters in the table are arranged row by row, from left to right, starting with the Space character.