
Methodology for assessing the failure rate of functional units of integrated circuits

Baryshnikov A.V.

(FGUP Research Institute “Automation”)

1. Introduction

The problem of predicting the reliability of radio-electronic equipment (REA) is relevant for almost all modern technical systems. Because REA is built from electronic components, techniques are needed for estimating the failure rate (FR) of these components. The reliability requirements specified in the terms of reference (TOR) for REA development often conflict with the requirements on the weight and dimensions of the equipment, so the reliability requirements cannot be met by, for example, duplication (redundancy).

For a number of types of electronic equipment, increased reliability requirements are imposed on control devices located on the same chip as the main functional units of the equipment. An example is the modulo-2 addition circuit that monitors the operation of the main and backup units of an equipment block. Increased reliability requirements may also be imposed on memory areas that store the information needed to execute the operating algorithm of the equipment.

The proposed technique makes it possible to evaluate the FR of different functional areas of microcircuits. In memory chips (random access memory (RAM), read-only memory (ROM), reprogrammable ROM (EPROM, denoted RPZU in the formulas)), these are the FR of the storage arrays, decoders, and control circuits. In microcontrollers and microprocessors, the technique gives the FR of the memory areas, the arithmetic logic unit, the analog-to-digital and digital-to-analog converters, and so on. In field-programmable logic integrated circuits (FPGAs), it gives the FR of the main functional units that make up the FPGA: the configurable logic block, the input/output block, the memory areas, JTAG, etc. The technique also makes it possible to determine the FR of a single pin of the microcircuit, of a single memory cell and, in some cases, of individual transistors.

2. Purpose and scope of the methodology

The technique is designed to evaluate the operational FR $\lambda_e$ of various functional units of microcircuits: microprocessors, microcontrollers, memory chips, and programmable logic integrated circuits. In particular, it covers the on-chip memory areas and the FR of the memory cells of the storage arrays of foreign-made microcircuits, including microprocessors and FPGAs. Unfortunately, the lack of information about the FR of packages does not allow the methodology to be applied to domestic microcircuits.

The FR values determined by this method serve as initial data for calculating reliability characteristics in engineering studies of equipment.

The methodology contains an algorithm for calculating the FR, an algorithm for checking the calculation results, and examples of calculating the FR of the functional units of a microprocessor, memory chips, and programmable logic circuits.

3. Assumptions of the methodology

The methodology is based on the following assumptions:

Element failures are independent;

The FR of the microcircuit is constant.

In addition to these assumptions, it will be shown that the FR of an IC can be divided into the FR of the package and the FR of the crystal (die).

4. Initial data

1. Functional purpose of the microcircuit: microprocessor, microcontroller, memory, FPGA, etc.

2. Chip manufacturing technology: bipolar, CMOS.

3. The value of the failure rate of the microcircuit.

4. Block diagram of the microcircuit.

5. Type and capacity of the memory storage arrays.

6. Number of package pins.

5.1. From the known value of the FR of the microcircuit, the FR of the package and of the crystal are determined.

5.2. From the crystal FR found, the FR of the storage array, the decoder circuits, and the control circuits of a memory chip are calculated on the basis of its type and manufacturing technology. The calculation relies on the standard design of the electrical circuits that serve the storage array.

5.3. For a microprocessor or microcontroller, the FR of the memory areas are determined using the results of the previous step. The difference between the crystal FR and the FR found for the memory areas is the FR of the rest of the chip.

5.4. From the known crystal FR values for the devices of an FPGA family, their functional composition, and the number of nodes of each type, a system of linear equations is compiled. Each equation of the system is written for one part type from the FPGA family. The right-hand side of each equation is the sum of the products of the FR of the functional units of each type and the number of such units. The left-hand side of each equation is the crystal FR of the specific FPGA part type from the family.

The maximum number of equations in the system is equal to the number of FPGA part types in the family.

Solving the system of equations gives the FR values of the FPGA functional units.

5.5. From the results obtained in the previous steps, the FR of an individual memory cell, of a microcircuit pin, or of a transistor in a particular node of the block diagram can be found, provided the electrical circuit diagram of the node is known.

5.6. The calculation results for a memory chip are checked as follows: the FR of another memory chip obtained by the standard method is compared with the FR of that chip calculated from the data obtained in clause 5.2.

5.7. The calculation results for FPGAs are verified by calculating the crystal FR of one part type of the FPGA family under consideration that was not included in the system of equations. The calculation uses the FR values of the functional units obtained in clause 5.4, and the resulting FPGA FR is compared with the FR calculated by standard methods.

6. Analysis of the failure rate prediction model for microcircuits with regard to splitting the microcircuit failure rate into the sum of the crystal and package failure rates

The FR of the crystal, the package, and the external pins of a microcircuit are determined from the mathematical model for predicting the FR of foreign integrated circuits, for each type of IC.

Let us analyze the terms of the mathematical model for calculating the operational FR $\lambda_e$ of digital and analog integrated circuits of foreign production:

$$\lambda_e = (C_1 \pi_t + C_2 \pi_E)\,\pi_Q \pi_L, \tag{1}$$

where:

$C_1$ is the component of the IC FR that depends on the degree of integration;

$\pi_t$ is the coefficient that accounts for the overheating of the crystal relative to the environment;

$C_2$ is the component of the IC FR that depends on the package type;

$\pi_E$ is the coefficient that accounts for the severity of the operating conditions of the REA (equipment operation group);

$\pi_Q$ is the coefficient that accounts for the manufacturing quality level of the electronic component;

$\pi_L$ is the coefficient that accounts for the maturity of the manufacturing process of the electronic component.

This expression is valid for microcircuits made by both bipolar and MOS technologies and covers digital and analog circuits, programmable logic arrays and FPGAs, memory chips, and microprocessors.

The mathematical model for predicting the FR of integrated circuits, which is based on a US Department of Defense standard, is the sum of two terms. The first term characterizes failures determined by the degree of integration of the crystal and the electrical operating mode of the microcircuit (coefficients $C_1$, $\pi_t$); the second term characterizes failures associated with the package type, the number of package pins, and the operating conditions (coefficients $C_2$, $\pi_E$).

This separation is explained by the fact that the same microcircuit can be produced in different package types, which differ significantly in reliability (vibration resistance, hermeticity, hygroscopicity, etc.). Let us denote the first term as the FR determined by the crystal ($\lambda_{cr}$) and the second as the FR determined by the package ($\lambda_{corp}$).

From (1) we get:

$$\lambda_{cr} = C_1 \pi_t \pi_Q \pi_L, \qquad \lambda_{corp} = C_2 \pi_E \pi_Q \pi_L. \tag{2}$$

Then the FR of one pin of the microcircuit is:

$$\lambda_{pin} = \frac{\lambda_{corp}}{N_{pin}} = \frac{C_2 \pi_E \pi_Q \pi_L}{N_{pin}},$$

where $N_{pin}$ is the number of pins in the package of the integrated circuit.
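The split in equations (1) and (2) can be illustrated with a short Python sketch. The function name and all input values are illustrative assumptions, not data from this work; real coefficients would come from the prediction model for the specific IC.

```python
def split_failure_rate(c1, pi_t, c2, pi_e, pi_q, pi_l, n_pins):
    """Split the operational FR of equation (1) into crystal, package and per-pin parts."""
    lam_cr = c1 * pi_t * pi_q * pi_l      # crystal term, equation (2)
    lam_corp = c2 * pi_e * pi_q * pi_l    # package term, equation (2)
    lam_e = lam_cr + lam_corp             # operational FR, equation (1)
    lam_pin = lam_corp / n_pins           # FR attributed to one package pin
    return lam_e, lam_cr, lam_corp, lam_pin


# Illustrative inputs only (assumed values, not taken from the article).
lam_e, lam_cr, lam_corp, lam_pin = split_failure_rate(
    c1=0.0034, pi_t=0.2, c2=0.04, pi_e=5.0, pi_q=1.0, pi_l=1.0, n_pins=100
)
print(lam_e, lam_cr, lam_corp, lam_pin)
```

Because $\pi_Q \pi_L$ multiplies both terms, the crystal and package parts always add up to the operational FR.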

Let us find the ratio of the package FR to the operational FR of the microcircuit:

$$\frac{\lambda_{corp}}{\lambda_e} = \frac{C_2 \pi_E \pi_Q \pi_L}{(C_1 \pi_t + C_2 \pi_E)\,\pi_Q \pi_L} = \frac{C_2 \pi_E}{C_1 \pi_t + C_2 \pi_E}. \tag{3}$$

Let us analyze this expression in terms of how it is affected by the package type, the number of pins, the overheating of the crystal caused by the power dissipated in it, and the severity of the operating conditions.

6.1. Influence of the severity of operating conditions

Dividing the numerator and denominator of expression (3) by the coefficient $\pi_E$, we get:

$$\frac{\lambda_{corp}}{\lambda_e} = \frac{C_2}{C_1 \pi_t / \pi_E + C_2}. \tag{4}$$

Analysis of expression (4) shows that the ratio of the package FR to the operational FR of the microcircuit depends on the operation group: the more severe the operating conditions of the equipment (the larger the coefficient $\pi_E$), the larger the share of failures caused by the package (the denominator in equation (4) decreases), and the ratio $\lambda_{corp} / \lambda_e$ tends to 1.

6.2. Influence of package type and number of package pins

Dividing the numerator and denominator of expression (3) by the coefficient $C_2$, we get:

$$\frac{\lambda_{corp}}{\lambda_e} = \frac{\pi_E}{C_1 \pi_t / C_2 + \pi_E}. \tag{5}$$

Analysis of expression (5) shows that the ratio of the package FR to the operational FR of the microcircuit depends on the ratio of the coefficients $C_1$ and $C_2$, that is, on the relation between the degree of integration of the microcircuit and the package parameters. The greater the number of elements in the microcircuit (the larger the coefficient $C_1$), the smaller the share of failures caused by the package (the ratio $\lambda_{corp} / \lambda_e$ tends to zero). The greater the number of pins in the package (the larger the coefficient $C_2$), the more weight the package failures acquire (the ratio $\lambda_{corp} / \lambda_e$ tends to 1).

6.3. Influence of power dissipated in the crystal

It can be seen from expression (3) that as $\pi_t$ (the coefficient reflecting the overheating of the crystal caused by the power dissipated in it) increases, the denominator increases; consequently, the share of failures attributable to the package decreases and the failures of the crystal acquire a greater relative weight.
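The trends described in Sections 6.1 to 6.3 can be checked numerically with equation (3). The sketch below is only an illustration; the coefficient values are arbitrary assumptions chosen to make the direction of each change visible.

```python
def package_share(c1, pi_t, c2, pi_e):
    """Ratio lambda_corp / lambda_e from equation (3)."""
    return c2 * pi_e / (c1 * pi_t + c2 * pi_e)


# Arbitrary baseline (assumed values): crystal and package terms are equal.
base = package_share(c1=0.01, pi_t=1.0, c2=0.01, pi_e=1.0)

harsher = package_share(c1=0.01, pi_t=1.0, c2=0.01, pi_e=10.0)   # larger pi_E (6.1)
more_pins = package_share(c1=0.01, pi_t=1.0, c2=0.05, pi_e=1.0)  # larger C2 (6.2)
denser = package_share(c1=0.10, pi_t=1.0, c2=0.01, pi_e=1.0)     # larger C1 (6.2)
hotter = package_share(c1=0.01, pi_t=5.0, c2=0.01, pi_e=1.0)     # larger pi_t (6.3)

print(base, harsher, more_pins, denser, hotter)
```

With these inputs the package share rises for harsher conditions and for a larger $C_2$, and falls for a larger $C_1$ or a larger $\pi_t$, in line with the analysis above.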

Conclusion:

Analysis of how the ratio $\lambda_{corp} / \lambda_e$ (equation (3)) changes with the package type, the number of pins, the overheating of the crystal caused by the power dissipated in it, and the severity of the operating conditions shows that the first term in equation (1) characterizes the operational FR of the crystal and the second term characterizes the operational FR of the package. Equations (2) can therefore be used to evaluate the operational FR of the semiconductor crystal itself, of the package, and of the package pins. The operational FR of the crystal can be used as the starting point for assessing the FR of the functional units of microcircuits.

7. Calculation of the failure rate of a memory cell of storage devices that are part of memory chips, microprocessors and microcontrollers.

To determine the FR per bit of information in a semiconductor memory, consider its composition. A semiconductor memory of any type includes:

1) Storage array

2) Peripheral (framing) circuits:

o address part (row and column decoders)

o numeric part (write and read amplifiers)

o local control unit, which coordinates the operation of all nodes in the storage, write, refresh (dynamic memory), and erase (EPROM) modes.

7.1. Estimation of the number of transistors in various areas of memory.

Consider each component of the memory FR. The total FR of memory chips of different types and storage capacities can be determined by the standard prediction methods. The FR of the package and of the crystal are calculated in accordance with Section 5 of this work.

Unfortunately, the technical documentation for foreign memory chips does not give the total number of elements in the chip; only the information capacity of the storage array is given. Since each type of memory contains standard blocks, let us estimate the number of elements in a memory chip from the size of the storage array. To do this, consider the circuit design of each memory block.

7.1.1. RAM storage array

The literature gives electrical circuit diagrams of RAM memory cells made using Schottky TTL (TTLS), ECL, MOS, and CMOS technologies. Table 1 shows the number of transistors that make up one memory cell (1 bit of RAM information).

Table 1. Number of transistors in one memory cell

RAM type	Quantity	TTLS	ECL	MOS	CMOS
Static	Number of elements				4, 5, 6
Dynamic	Number of elements

7.1.2. ROM and PROM storage arrays

In bipolar ROMs and PROMs, the storage element of the array is implemented with diode and transistor structures: emitter followers on n-p-n and p-n-p transistors, collector-base junctions, emitter-base junctions, and Schottky diodes. In circuits manufactured using MOS and CMOS technologies, p-channel and n-channel transistors serve as the memory element. The memory element consists of one transistor or diode, so the total number of transistors in a ROM or PROM storage array is equal to the information capacity of the memory LSI.

7.1.3. EPROM (RPZU) storage array

Information recorded in an EPROM is retained for several years to several decades, so EPROM is often referred to as non-volatile memory. The mechanism for writing and storing information is based on the accumulation of charge during writing and its retention during reading and when the power is off, in special MOS transistors. The memory elements of an EPROM are built, as a rule, from two transistors.

Thus, the number of transistors in the EPROM storage array is equal to the information capacity of the EPROM multiplied by 2.

7.1.4. Address part

The address part of the memory is built from decoders. A decoder identifies an N-bit binary input number by producing a logical one at exactly one of its outputs. In integrated circuits it is customary to use linear decoders or a combination of linear and rectangular (matrix) decoders. A linear decoder has $N$ inputs and $2^N$ AND gates. Let us find the number of transistors needed to build such decoders in CMOS, the technology most commonly used for LSI. Table 2 shows the number of transistors required to build decoders with different numbers of inputs.

Table 2. Number of transistors required to build decoders

Number of inputs, N	Address inverters, number	Address inverters, transistors	AND gates, number	AND gates, transistors (2·N·2^N)	Total transistors in the decoder (2·N·2^N + 2·N)
2	2	4	4	4*4 = 16	16+4 = 20
3	3	6	8	6*8 = 48	48+6 = 54
4	4	8	16	8*16 = 128	128+8 = 136
5	5	10	32	10*32 = 320	320+10 = 330
6	6	12	64	12*64 = 768	768+12 = 780
7	7	14	128	14*128 = 1792	1792+14 = 1806
8	8	16	256	16*256 = 4096	4096+16 = 4112
9	9	18	512	18*512 = 9216	9216+18 = 9234
10	10	20	1024	20*1024 = 20480	20480+20 = 20500

For linear decoders, the bit width of the decoded number does not exceed 8 to 10. Therefore, when the number of words in the memory exceeds 1K, the modular principle of memory construction is used.
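The totals in Table 2 follow the expression $2 N \cdot 2^N + 2N$. A minimal Python sketch of that count is given below; the function name is an assumption made for illustration, and the per-gate transistor counts are the ones implied by the table (2 transistors per CMOS inverter, 2·N transistors per N-input AND gate).

```python
def linear_decoder_transistors(n_inputs):
    """Transistor count of a CMOS linear decoder with n_inputs address lines (Table 2)."""
    inverter_transistors = 2 * n_inputs          # N address inverters, 2 transistors each
    and_gates = 2 ** n_inputs                    # one AND gate per decoder output
    and_transistors = 2 * n_inputs * and_gates   # each AND gate counted as 2*N transistors
    return and_transistors + inverter_transistors


for n in (2, 5, 10):
    print(n, linear_decoder_transistors(n))
```

For 2, 5, and 10 inputs the function returns 20, 330, and 20500, matching the last column of Table 2.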

7.1.5. Numeric part (write and read amplifiers)

These circuits convert the levels of the read signals into the output levels of logic elements of a particular type and increase the load capacity. As a rule, they are built as open-collector (bipolar) or tri-state (CMOS) circuits. Each output circuit may consist of several (two or three) inverters. For the maximum microprocessor word length of 32 bits, these circuits contain no more than 200 transistors.

7.1.6. Local control unit

Depending on the type of memory, the local control unit may include row and column buffer registers, address multiplexers, refresh control blocks (in dynamic memory), and information erasure circuits.

7.1.7. Estimation of the number of transistors in different areas of the memory

The quantitative ratio of the RAM transistors in the storage array, the decoder, and the local control unit is approximately 100:10:1, that is, about 89%, 10%, and 1%, respectively. The number of transistors in a storage cell of RAM is given in Table 1, and that of ROM, PROM, and EPROM in Sections 7.1.2 and 7.1.3. Using these data and the percentages of elements in the different areas of the RAM, and assuming that, for the same storage capacity, the number of elements in the decoder and the local control unit remains approximately constant across memory types, one can estimate the ratio of transistors in the storage array, decoder, and local control unit of the different memory types. Table 3 shows the results of this estimate.

Table 3. Quantitative ratio of transistors in different functional areas of the memory

Memory type	Storage array	Decoder	Local control unit
RAM	100	10	1
ROM, PROM

Thus, knowing the capacity of the storage array and the FR of the memory chip, one can find the FR of the storage array, the address part, the numeric part, and the local control unit, as well as the FR of a memory cell and of the transistors that make up the peripheral circuits.
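The scaling argument of Section 7.1.7 can be sketched in Python. The sketch assumes that the 100:10:1 RAM ratio corresponds to a 6-transistor static cell (one of the CMOS values in Table 1) and that the decoder and local control unit are the same size for every memory type; both are assumptions made here for illustration, and the function name is hypothetical. Under these assumptions the storage shares come out close to the coefficients 0.89, 0.607, and 0.75 used in Section 8.2.

```python
RAM_RATIO = (100.0, 10.0, 1.0)  # storage : decoder : local control unit, from the text


def area_shares(transistors_per_cell, ram_transistors_per_cell=6):
    """Shares of transistors in the storage array, decoder and local control unit.

    The storage part is scaled by the number of transistors per cell; the decoder
    and control parts are kept at their RAM values (assumption of Section 7.1.7).
    """
    storage = RAM_RATIO[0] * transistors_per_cell / ram_transistors_per_cell
    decoder, control = RAM_RATIO[1], RAM_RATIO[2]
    total = storage + decoder + control
    return storage / total, decoder / total, control / total


ram = area_shares(6)     # static RAM cell, assumed 6 transistors
rom = area_shares(1)     # ROM / PROM cell, 1 transistor (Section 7.1.2)
eprom = area_shares(2)   # EPROM cell, 2 transistors (Section 7.1.3)
print(ram, rom, eprom)
```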

8. Calculation of the failure rate of functional units of microprocessors and microcontrollers

This section provides an algorithm for calculating the FR of the functional units of microprocessor and microcontroller microcircuits. The technique is applicable to microprocessors and microcontrollers with a word length of no more than 32 bits.

8.1. Initial data for calculating the failure rate

Below are the initial data needed to calculate the FR of microprocessors, microcontrollers, and parts of their electrical circuits. By a part of the electrical circuit we mean both functionally complete nodes of the microprocessor (microcontroller), namely the different types of memory (RAM, ROM, PROM, EPROM), the ADC, the DAC, etc., and individual gates or even transistors.

Initial data

Word length of the microprocessor or microcontroller;

Microchip manufacturing technology;

Type and organization of the on-chip memory;

Information capacity of the memory;

Power consumption;

Thermal resistance, crystal-to-package or crystal-to-ambient;

Chip package type;

Number of package pins;

Elevated operating ambient temperature;

Manufacturing quality level.

8.2. Algorithm for calculating the failure rate of the microprocessor (microcontroller) and functional units of the microprocessor (microcontroller)

1. Determine the operational FR of the microprocessor or microcontroller ($\lambda_{e\,mp}$) from the initial data, using one of the automated calculation programs, ASRN or Asonika-K, or the Military Handbook 217F (MIL-HDBK-217F) standard.

Note: From here on, all calculations and comments are given in terms of the ASRN, because the usage and content of the ASRN, the Asonika-K program, and the MIL-HDBK-217F standard have much in common.

2. Determine the FR of the memories included in the microprocessor ($\lambda_{E\,RAM}$, $\lambda_{E\,ROM,PROM}$, $\lambda_{E\,RPZU}$), assuming that each memory is a separate chip in its own package:

$$\lambda_{E\,RAM} = \lambda_{RAM} + \lambda_{corp},$$

$$\lambda_{E\,ROM,PROM} = \lambda_{ROM,PROM} + \lambda_{corp},$$

$$\lambda_{E\,RPZU} = \lambda_{RPZU} + \lambda_{corp},$$

where $\lambda_E$ are the operational FR of the different memory types, $\lambda_{corp}$ is the FR of the package for each memory type, and $\lambda_{RAM}$, $\lambda_{ROM,PROM}$, $\lambda_{RPZU}$ are the FR of the RAM, the ROM or PROM, and the EPROM, respectively, without the package.

The initial data for calculating the operational FR of the different memory types are taken from technical documentation (data sheets) and integrated circuit catalogs. There, one must find a memory whose type (RAM, ROM, PROM, EPROM), storage capacity, organization, and manufacturing technology are the same as, or close to, those of the memory included in the microprocessor (microcontroller). The characteristics of the memory chips found are entered into the ASRN to calculate the operational FR of the memory chips. The power consumed by the memory is chosen on the basis of the electrical operating mode of the microprocessor (microcontroller).

3. Determine the FR of the on-chip areas of the microprocessor (microcontroller), the memories, and the ALU without taking the package into account: $\lambda_{cr\,mp}$, $\lambda_{RAM}$, $\lambda_{ROM,PROM}$, $\lambda_{RPZU}$, $\lambda_{ALU}$.

The FR of the on-chip areas of the microprocessor, RAM, ROM, PROM, and EPROM are determined from the relation $\lambda_{cr} = C_1 \pi_t \pi_Q \pi_L$.

The FR of the ALU and of the part of the chip without memory circuits is determined from the expression:

$$\lambda_{ALU} = \lambda_{cr\,mp} - \lambda_{RAM} - \lambda_{ROM,PROM} - \lambda_{RPZU}.$$

The FR of other functionally complete parts of the microprocessor (microcontroller) are found in a similar way.

4. Determine the FR of the storage arrays of the on-chip memories: $\lambda_{N\,RAM}$, $\lambda_{N\,ROM,PROM}$, $\lambda_{N\,RPZU}$.

From the data in Table 3, the share of transistors in each functional area of a memory can be expressed as a percentage, taking the total number of transistors in the memory as 100%. Table 4 shows these percentages for on-chip memories of various types.

From the percentage of transistors in each functional area of the memory and the FR found for the on-chip memory, the FR of the functional units are determined.

Table 4. Percentage of transistors in the functional areas of the memory

Memory type	Storage array, %	Decoder, %	Local control unit, %
RAM	89	10	1
ROM, PROM	60.7
EPROM (RPZU)	75

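$$\lambda_{N\,RAM} = 0.89\,\lambda_{RAM};$$

$$\lambda_{N\,ROM,PROM} = 0.607\,\lambda_{ROM,PROM};$$

$$\lambda_{N\,RPZU} = 0.75\,\lambda_{RPZU},$$

where $\lambda_{N\,RAM}$, $\lambda_{N\,ROM,PROM}$, $\lambda_{N\,RPZU}$ are the FR of the storage arrays of the RAM, the ROM or PROM, and the EPROM, respectively.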
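Steps 3 and 4 of the algorithm can be sketched in Python as follows. The storage shares are the coefficients given above; the function name, the dictionary keys, and the numerical inputs are illustrative assumptions, not results from this work.

```python
# Share of the memory crystal FR attributed to the storage array (step 4).
STORAGE_SHARE = {"RAM": 0.89, "ROM_PROM": 0.607, "RPZU": 0.75}


def functional_unit_fr(lam_cr_mp, lam_memories):
    """Return (lam_alu, lam_storage) for a microprocessor crystal.

    lam_cr_mp    - crystal FR of the whole microprocessor (without package)
    lam_memories - dict: memory type -> crystal FR of that on-chip memory
    """
    # Step 3: the ALU and the rest of the chip take what the memories do not.
    lam_alu = lam_cr_mp - sum(lam_memories.values())
    # Step 4: storage-array FR as a fixed share of each memory FR.
    lam_storage = {kind: STORAGE_SHARE[kind] * lam for kind, lam in lam_memories.items()}
    return lam_alu, lam_storage


# Assumed example values (same units as the operational FR).
lam_alu, lam_storage = functional_unit_fr(0.010, {"RAM": 0.002, "ROM_PROM": 0.003})
print(lam_alu, lam_storage)
```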

8.3. Calculation of the failure rate of the functional units of the memory: decoders, address part, control circuits.

Using data on the ratio of the number of transistors in each part of the memory (Table 4), one can find the failure rates of the decoders, the address part, and the memory control circuits. Knowing the number of transistors in each part of the memory, you can find the failure rate of a group or individual transistors of the memory.
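A minimal sketch of the last step, dividing the FR of a block evenly among its elements, is shown below. The even split across cells and transistors is the assumption implied by the transistor-count ratios used in this work; the helper name and the numbers (an 8 Kbit storage array with 6 transistors per cell) are illustrative.

```python
def fr_per_element(lam_block, n_elements):
    """FR of one element, assuming the block FR is spread evenly over its elements."""
    if n_elements <= 0:
        raise ValueError("n_elements must be positive")
    return lam_block / n_elements


lam_storage_ram = 0.00178                                # assumed storage-array FR
lam_cell = fr_per_element(lam_storage_ram, 8 * 1024)     # FR of one memory cell (1 bit)
lam_transistor = fr_per_element(lam_cell, 6)             # FR of one transistor of the cell
print(lam_cell, lam_transistor)
```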

9. Calculation of the failure rate of functionally complete units of memory chips

This section presents an algorithm for calculating the FR of functionally complete units of memory chips. The technique is applicable to the memory chips listed in the ASRN.

9.1. Initial data for calculating the failure rate

Below are the initial data needed to calculate the FR of functionally complete nodes of memory chips. By functionally complete nodes of memory chips we mean the storage array, the address part, and the control circuit. The technique also allows the FR of parts of functional units, individual gates, and transistors to be calculated.

Initial data

Memory type: RAM, ROM, PROM, EPROM;

Information capacity of the memory;

Memory organization;

Manufacturing technology;

Power consumption;

Chip package type;

Number of package pins;

Thermal resistance, crystal-to-package or crystal-to-ambient;

Equipment operation group;

Elevated operating ambient temperature;

Manufacturing quality level.

9.2. Algorithm for calculating the failure rate of memory circuits and functionally complete units of memory circuits

1. Determine the operational FR of the memory chip ($\lambda_{e\,p}$) from the initial data, using one of the automated calculation programs, ASRN or Asonika-K, or the Military Handbook 217F (MIL-HDBK-217F) standard.

2. Determine the FR of the memory chip crystal, without the package, $\lambda_{cr\,zu}$:

$$\lambda_{cr\,zu} = C_1 \pi_t \pi_Q \pi_L.$$

3. Calculate the FR of the on-chip storage array and the FR of the functional units in accordance with Section 8.2.

10. Calculation of the failure rate of functionally complete nodes of programmable logic integrated circuits and gate arrays (base matrix crystals)

Each FPGA family consists of a set of part types with the same architecture. The crystal architecture is built from identical functional nodes of several types. Parts of different types within a family differ in the package type and in the number of functional nodes of each type: configurable logic blocks, input/output blocks, memory, JTAG, and the like.

It should be noted that, in addition to configurable logic blocks and input/output blocks, each FPGA contains a switch matrix that forms the connections between FPGA elements. Since these areas are distributed evenly over the chip, apart from the input/output blocks located on the periphery, the switch matrix can be treated as part of the configurable logic blocks and input/output blocks.

To calculate the failure rates of the functional units, a system of linear equations must be composed. The system of equations is compiled for each FPGA family.

Each equation of the system is an equality whose left-hand side is the crystal FR of a specific part type from the selected family. The right-hand side is the sum of the products of the number $n$ of functional nodes of category $i$ and the FR of these nodes $\lambda_{ni}$.

The general form of such a system of equations is given below.

$$\begin{aligned}
\lambda_{e\,a} &= a_1 \lambda_1 + a_2 \lambda_2 + \dots + a_n \lambda_n \\
\lambda_{e\,b} &= b_1 \lambda_1 + b_2 \lambda_2 + \dots + b_n \lambda_n \\
&\;\;\vdots \\
\lambda_{e\,k} &= k_1 \lambda_1 + k_2 \lambda_2 + \dots + k_n \lambda_n
\end{aligned}$$

where

$\lambda_{e\,a}, \lambda_{e\,b}, \dots, \lambda_{e\,k}$ are the operational FR of the microcircuits of the FPGA family (microcircuits a, b, ..., k, respectively),

$a_1, a_2, \dots, a_n$ are the numbers of functional units of categories 1, 2, ..., n in microcircuit a, respectively,

$b_1, b_2, \dots, b_n$ are the numbers of functional units of categories 1, 2, ..., n in microcircuit b, respectively,

$k_1, k_2, \dots, k_n$ are the numbers of functional units of categories 1, 2, ..., n in microcircuit k, respectively,

$\lambda_1, \lambda_2, \dots, \lambda_n$ are the FR of the functional units of categories 1, 2, ..., n, respectively.

The operational FR of the microcircuits $\lambda_{e\,a}, \lambda_{e\,b}, \dots, \lambda_{e\,k}$ are calculated using the ASRN; the number and type of functional units are given in the technical documentation for the FPGA (the data sheet or domestic periodicals).

The FR of the functional nodes of the FPGA family $\lambda_1, \lambda_2, \dots, \lambda_n$ are found by solving the system of equations.
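Solving such a system is a standard linear-algebra task. The sketch below uses plain Gaussian elimination with partial pivoting so that it needs no external libraries; the unit counts and FR values are hypothetical and serve only to show that the unknown per-unit FR are recovered. Taking one JTAG block per chip is likewise an assumption of the example.

```python
def solve_linear_system(a, b):
    """Solve a*x = b for a square system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-15:
            raise ValueError("system is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):           # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


# Hypothetical family of three chips: columns are the counts of logic blocks,
# I/O blocks and JTAG blocks (one JTAG block per chip is assumed).
counts = [[10, 4, 1], [20, 6, 1], [40, 12, 1]]
assumed_fr = [0.001, 0.002, 0.01]            # per-unit FR used to build the example
lam_chip = [sum(n * f for n, f in zip(row, assumed_fr)) for row in counts]

solved = solve_linear_system(counts, lam_chip)
print(solved)
```

The number of equations must equal the number of unknown unit types, and the rows must be linearly independent; otherwise the solver raises an error.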

11. Checking the calculation results

The calculation results for a memory chip are verified by calculating the crystal FR of another memory chip from the FR found for the memory cell and comparing the result with the crystal FR calculated by standard methods (ASRN, Asonika-K, etc.).

The calculation results for an FPGA are verified by calculating the crystal FR of another FPGA part type from the same family from the FR found for the FPGA functional units and comparing the result with the FR calculated by standard methods (ASRN, Asonika-K, etc.).

12. An example of calculating the failure rates of FPGA functional units and checking the calculation results

12.1. Calculation of the FR of the functional units and package pins of FPGAs

The FR calculation is illustrated with FPGAs of the Spartan family, developed by Xilinx.

The Spartan family consists of 5 FPGA part types, which include a matrix of configurable logic blocks, input/output blocks, and boundary scan logic (JTAG).

The FPGAs in the Spartan family differ in the number of logic gates, the number of configurable logic blocks, the number of input/output blocks, the package types, and the number of package pins.

Below is the calculation of the FR of the configurable logic blocks, input/output blocks, and JTAG for the XCS 05XL, XCS 10XL, and XCS 20XL FPGAs.

To check the results, the operational FR of the XCS 30XL FPGA is calculated and compared with the FR calculated using the ASRN. As a further check, the FR of one pin is compared for different FPGA packages.

12.1.1. Calculation of the failure rates of the functional units of the XCS 05XL, XCS 10XL, and XCS 20XL FPGAs

In accordance with the calculation algorithm above, calculating the FR of the functional units of the FPGA requires the following steps:

Compile the list and values of the initial data for the XCS 05XL, XCS 10XL, XCS 20XL, and XCS 30XL FPGAs;

Calculate the operational FR of the XCS 05XL, XCS 10XL, XCS 20XL, and XCS 30XL FPGAs (the calculation uses the initial data);

Compose a system of linear equations for the crystals of the XCS 05XL, XCS 10XL, and XCS 20XL FPGAs;

Solve the system of linear equations (the unknowns are the FR of the functional units: configurable logic blocks, input/output blocks, boundary scan logic);

Compare the crystal FR of the XCS 30XL FPGA calculated from the FR of the functional units found in the previous step with the crystal FR obtained using the ASRN;

Compare the FR of one pin for different packages;

Formulate a conclusion about the validity of the calculations;

When a satisfactory match of the failure rates is obtained (a discrepancy of 10% to 20%), stop the calculations;

If there is a large discrepancy between the calculation results, correct the initial data.
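The acceptance check in the last steps can be written as a short Python sketch. The 20% default tolerance is taken from the upper bound of the range stated above; the function names and the sample values are assumptions made for illustration.

```python
def relative_discrepancy(predicted, reference):
    """Relative difference between the FR from the unit model and the reference FR."""
    return abs(predicted - reference) / reference


def is_satisfactory(predicted, reference, tolerance=0.20):
    """True if the discrepancy does not exceed the tolerance (20% by default)."""
    return relative_discrepancy(predicted, reference) <= tolerance


# Assumed example: FR rebuilt from functional units vs. FR from a standard method.
print(relative_discrepancy(0.0016, 0.0018))
print(is_satisfactory(0.0016, 0.0018))
print(is_satisfactory(0.0010, 0.0018))
```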

The initial data for calculating the operational FR of an FPGA are: the manufacturing technology, the number of gates, the power consumption, the overheating temperature of the crystal relative to the environment, the package type, the number of package pins, the crystal-to-package thermal resistance, the manufacturing quality level, and the operation group of the equipment in which the FPGA is used.

All initial data except the power consumption, the crystal overheating temperature, and the equipment operation group are taken from the technical documentation for the FPGA. The power consumption can be found in the technical literature, by calculation, or by measurement on the board. The overheating temperature of the crystal relative to the environment is found as the product of the power consumption and the crystal-to-package thermal resistance. The equipment operation group is given in the specifications for the equipment.

The initial data for calculating the operational failure rate of FPGA XCS 05XL, XCS 10XL, XCS 20XL, XCS 30XL are given in Table 5.

Table 5. Initial data

Initial data	XCS 05XL	XCS 10XL	XCS 20XL	XCS 30XL
Manufacturing technology
Maximum number of logic gates
Number of configurable logic blocks, N clb
Number of used inputs/outputs, N in/out
Package type	VQFP	TQFP	PQFP	PQFP
Number of package pins
Thermal resistance crystal-to-package, °C/W
Manufacturing quality level	Commercial
Equipment operation group

To determine the overheating temperature of the crystal relative to the ambient temperature, it is necessary to find the power consumption for each microcircuit.

In most CMOS integrated circuits, almost all power dissipation is dynamic and is determined by the charging and discharging of internal and external load capacitances. Each pin of a chip dissipates power according to its capacitance, which is constant for each type of pin, and the frequency at which each pin switches may differ from the clock frequency of the chip. The total dynamic power is the sum of the powers dissipated at each pin. Thus, to calculate the power, one needs to know the number of elements used in the FPGA. For the Spartan family, the current consumption of an input/output block (12 mA) is given for a load of 50 pF, a supply voltage of 3.3 V, and the maximum FPGA operating frequency of 80 MHz. Experimental data on the power consumption are not available, so we assume that the power consumption of the FPGA is determined by the number of switching input/output blocks (the most powerful consumers of energy) and estimate the power consumed by each FPGA on the assumption that 50% of the input/output blocks switch simultaneously at some fixed frequency (in the calculation, the frequency was chosen 5 times lower than the maximum).
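One possible reading of this estimate is sketched below in Python. It assumes that the current of an input/output block scales linearly with the switching frequency (as dynamic CMOS power does), so that at one fifth of the maximum frequency each block draws one fifth of 12 mA; this scaling, the function names, the block count, and the thermal resistance are assumptions for illustration, not values from this work. The overheating temperature is the product of power and crystal-to-package thermal resistance, as stated earlier.

```python
def estimate_power_w(n_io_blocks, i_block_a=0.012, v_supply=3.3,
                     switching_fraction=0.5, freq_ratio=1 / 5):
    """Rough power estimate: only the switching I/O blocks are counted."""
    i_scaled = i_block_a * freq_ratio  # assumed linear scaling of current with frequency
    return n_io_blocks * switching_fraction * i_scaled * v_supply


def overheating_c(power_w, r_th_c_per_w):
    """Crystal overheating: power times crystal-to-package thermal resistance."""
    return power_w * r_th_c_per_w


power = estimate_power_w(n_io_blocks=100)           # hypothetical 100 I/O blocks
delta_t = overheating_c(power, r_th_c_per_w=40.0)   # hypothetical 40 degC/W
print(power, delta_t)
```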

Table 6 shows the power consumed by the FPGA and the overheating temperature of the crystals relative to the chip package.

Table 6. Power consumption of FPGA

Parameter	XCS 05XL	XCS 10XL	XCS 20XL	XCS 30XL
Power consumption, W
Crystal overheating temperature, °C

Let us calculate the values of the coefficients in equation (1):

$$\lambda_e = (C_1 \pi_t + C_2 \pi_E)\,\pi_Q\,\pi_L$$

The coefficients π t, С 2 , π E , π Q , π L are calculated according to the ASRN. The coefficients C 1 are found using the approximation of the values of the coefficient C 1 given in the ASRN for FPGAs of different degrees of integration.

The values of the coefficient C 1 for the FPGA are shown in Table 7.

Table 7. Values of coefficient C 1

Number of gates in FPGA	Coefficient C 1
Up to 500	0.00085
From 501 to 1000	0.0017
From 2001 to 5000	0.0034
From 5001 to 20000	0.0068

Then, for the maximum number of gates of the FPGAs XCS 05XL, XCS 10XL, XCS 20XL, XCS 30XL, we obtain the values of the coefficient C 1 of 0.0034, 0.0048, 0.0068 and 0.0078, respectively.

The values of the coefficients π t, C 2, π E, π Q, π L for XCS 05XL, XCS 10XL, XCS 20XL, XCS 30XL are shown in Table 8.

Table 8. Values of the coefficients of the FPGA operational failure rate

Designation and name of coefficients	Coefficient values
Designation and name of coefficients	XCS 05XL	XCS 10XL	XCS 20XL	XCS 30XL
π t	0.231	0.225	0.231	0.222
C 2	0.04	0.06	0.089	0.104
π E
π Q
π L
Crystal failure rate λcr = C 1 π t π Q π L, 10^-6 1/hour	0.0007854	0.0011	0.00157	0.0018
Package failure rate λcorp = C 2 π E π Q π L, 10^-6 1/hour			0.445	0.52
FPGA operational failure rate λe, 10^-6 1/hour	0.2007854	0.3011	0.44657	0.5218

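Let us find the FR of the configurable logic blocks λ clb, the input/output blocks λ in/out and the boundary scan logic λ JTAG for the FPGAs XCS 05XL, XCS 10XL, XCS 20XL. To do this, we compose a system of linear equations:

$$
\begin{cases}
\lambda_{cr}^{XCS05XL} = \lambda_{clb} N_{clb}^{XCS05XL} + \lambda_{in/out} N_{in/out}^{XCS05XL} + \lambda_{JTAG} \\
\lambda_{cr}^{XCS10XL} = \lambda_{clb} N_{clb}^{XCS10XL} + \lambda_{in/out} N_{in/out}^{XCS10XL} + \lambda_{JTAG} \\
\lambda_{cr}^{XCS20XL} = \lambda_{clb} N_{clb}^{XCS20XL} + \lambda_{in/out} N_{in/out}^{XCS20XL} + \lambda_{JTAG}
\end{cases}
$$

where λcr XCS 05XL, N clb XCS 05XL, N in/out XCS 05XL are the crystal FR, the number of configurable logic blocks and the number of input/output blocks for the FPGA XCS 05XL, respectively;

λcr XCS 10XL, N clb XCS 10XL, N in/out XCS 10XL are the crystal FR, the number of configurable logic blocks and the number of input/output blocks for the FPGA XCS 10XL, respectively;

λcr XCS 20XL, N clb XCS 20XL, N in/out XCS 20XL are the crystal FR, the number of configurable logic blocks and the number of input/output blocks for the FPGA XCS 20XL, respectively.

Substituting into the system of equations the values of the crystal FR, the number of configurable logic blocks and the number of input/output blocks, we get (the equation for the XCS 20XL is shown): 0.00157 * 10^-6 = 400 * λ clb + 160 * λ in/out + λ JTAG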

The system of three linear equations with three unknowns has a unique solution:

λ clb = 5.16 * 10^-13 1/hour; λ in/out = 7.58 * 10^-12 1/hour; λ JTAG = 1.498 * 10^-10 1/hour.
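A system of this form can be solved, for example, by Cramer's rule. The sketch below is illustrative only. The counts of 400 logic blocks and 160 input/output blocks for the XCS 20XL come from the text; the counts for the XCS 05XL (100 and 77) and the XCS 10XL (196 and 112) are not given in the surviving text and are assumptions taken from the Spartan-XL family data. The right-hand sides are the rounded crystal FR values from Table 8, so the result agrees with the quoted solution only to within a few percent.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def solve3(matrix, rhs):
    """Solve a 3x3 linear system by Cramer's rule."""
    d = det3(matrix)
    if d == 0:
        raise ValueError("system has no unique solution")
    solution = []
    for col in range(3):
        replaced = [row[:] for row in matrix]
        for r in range(3):
            replaced[r][col] = rhs[r]
        solution.append(det3(replaced) / d)
    return solution


# Rows: [N_clb, N_in/out, 1] for XCS 05XL, XCS 10XL, XCS 20XL.
# The 05XL and 10XL counts are assumptions (see the text above).
counts = [[100, 77, 1], [196, 112, 1], [400, 160, 1]]
# Crystal FR from Table 8, converted to 1/hour.
crystal_fr = [0.0007854e-6, 0.0011e-6, 0.00157e-6]

lam_clb, lam_io, lam_jtag = solve3(counts, crystal_fr)
print(lam_clb, lam_io, lam_jtag)
```

The small determinant of the count matrix makes the solution sensitive to rounding of the crystal FR values, which is why the inputs should be kept at full precision in a real calculation.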

12.1.2. Checking calculation results

To check the obtained solution, we calculate the crystal FR of the FPGA XCS 30XL, λcr XCS 30XL 1, using the found values of λ clb, λ in/out, λ JTAG.

By analogy with the equations of the system, λcr XCS 30XL 1 is equal to:

λcr XCS 30XL 1 = λ clb * N clb XCS 30XL + λ in/out * N in/out XCS 30XL + λ JTAG =

= 576 * 5.16 * 10^-13 + 192 * 7.58 * 10^-12 + 1.498 * 10^-10 = 0.0019 * 10^-6 1/hour.

The crystal FR obtained using the ASRN is (Table 9): 0.0018 * 10^-6 1/hour. The relative difference between these values is: (λcr XCS 30XL 1 - λcr XCS 30XL) * 100% / λcr XCS 30XL 1 ≈ 5%.
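The check can be repeated numerically. The sketch below uses only the values quoted in the text (the three block failure rates, 576 logic blocks and 192 input/output blocks for the XCS 30XL, and the reference value of 0.0018 * 10^-6 1/hour); the function and variable names are illustrative.

```python
LAMBDA_CLB = 5.16e-13    # 1/hour, configurable logic block
LAMBDA_IO = 7.58e-12     # 1/hour, input/output block
LAMBDA_JTAG = 1.498e-10  # 1/hour, boundary scan logic


def crystal_fr(n_clb, n_io):
    """Crystal failure rate as the sum of the functional-unit rates."""
    return LAMBDA_CLB * n_clb + LAMBDA_IO * n_io + LAMBDA_JTAG


predicted = crystal_fr(576, 192)   # XCS 30XL
reference = 0.0018e-6              # crystal FR from the ASRN calculation
rel_diff_percent = (predicted - reference) * 100 / predicted
print(f"predicted = {predicted:.4e} 1/hour, difference = {rel_diff_percent:.1f}%")
```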

The FR of one pin, obtained by dividing the package FR by the number of pins in the packages of the FPGAs XCS 05XL, XCS 10XL, XCS 20XL, XCS 30XL, is equal to 0.002 * 10^-6, 0.00208 * 10^-6, 0.0021 * 10^-6, 0.0021 * 10^-6, respectively, i.e. the values differ by no more than 5%.

The difference in the FR values, which is about 5%, is probably explained by the approximate values of the dissipated power taken in the calculation and, as a result, by inaccurate values of the coefficient π t, as well as by the presence of unaccounted-for FPGA elements, information about which is not available in the documentation.

The appendix contains a block diagram for calculating and checking the failure rates of FPGA functional areas.

13. Conclusions

1. A methodology for assessing the FR of functional units of integrated circuits is proposed.

2. It allows one to calculate:

a) for memory circuits: the FR of storage devices, memory cells, decoders, control circuits;

b) for microprocessors and microcontrollers: the FR of memory devices, registers, ADCs, DACs and functional blocks built on their basis;

c) for programmable logic integrated circuits: the FR of the blocks of various functional purposes included in them (configurable logic blocks, input/output blocks, memory cells, JTAG) and of functional blocks built on their basis.

3. A method for checking the calculated FR values of functional units is proposed.

4. Application of the verification method to the calculated FR values of the functional units of integrated circuits showed the adequacy of the proposed approach to assessing the FR.

Appendix

Block diagram for calculating the failure rate of FPGA functional units

Literature

Porter D.C, Finke W.A. Reliability characterization and prediction of IC. PADS-TR-70, p.232.

Military Handbook 217F. “Reliability prediction of electronic equipment”. Department of Defense, Washington, DC 20301.

“Automated system reliability calculation”, developed by the 22nd Central Research Institute of the Ministry of Defense of the Russian Federation with the participation of the RNII “Elektronstandart” and JSC “Standartelectro”, 2006.

“Semiconductor storage devices and their application”, V.P. Andreev, V.V. Baranov, N.V. Bekin and others; Edited by Gordonov. M. Radio and communication. 1981.-344pp.

Development prospects computer science: V. 11 book: Ref. allowance / Under the editorship of Yu.M. Smirnov. Book. 7: “Semiconductor storage devices”, A.B.Akinfiev, V.I.Mirontsev, G.D.Sofisky, V.V.Tsyrkin. - M .: Higher. school 1989. - 160 p.: ill.

“LSI circuit design of permanent memory devices”, O.A. Petrosyan, I.Ya.Kozyr, L.A.Koledov, Yu.I.Schetinin. – M.; Radio and communication, 1987, 304 p.

“Reliability of random access memory”, EVM, Leningrad, Energoizdat, 1987, 168 p.

TIIER, v.75, issue 9, 1987

Xilinx. The Programmable Logic. Data Book, 2008. http:www.xilinx.com.

"Sector of electronic components", Russia-2002-M.: Publishing house "Dodeka-XXI", 2002.

DS00049R-page 61  2001 Microchip Technology Inc .

TMS320VC5416 Fixed-Point Digital Signal Processor, Data Manual, Literature Number SPRS095K.

Company CD-ROM Integrated Device Technology.

CD-ROM from Holtec Semiconductor.

Failure rate: the conditional probability density of failure of a non-recoverable object, determined for the considered moment in time, provided that no failure has occurred up to this moment.

Thus, statistically, the failure rate is equal to the number of failures that occurred per unit of time, divided by the number of objects that have not failed up to the given moment.

A typical change in the failure rate over time is shown in fig. 5.

Operating experience with complex systems shows that the change in the failure rate λ(t) of the majority of objects is described by a U-shaped curve.

Time can be conditionally divided into three characteristic areas: 1. Run-in period. 2. Period of normal use. 3. The aging period of the object.

Fig. 5. Typical change in failure rate

The run-in period of an object has an increased failure rate caused by run-in failures due to defects in production, installation and commissioning. Sometimes the end of this period is associated with the warranty service of the object, when the elimination of failures is carried out by the manufacturer. During normal operation, the failure rate remains practically constant, while failures are random in nature and appear suddenly, primarily due to random load changes, non-compliance with operating conditions, adverse external factors, etc. It is this period that corresponds to the main time of operation of the facility.

The increase in the failure rate refers to the aging period of the object and is caused by an increase in the number of failures due to wear, aging and other reasons associated with long-term operation. That is, the probability of failure of an element that has survived until the moment t, in some subsequent time interval, depends on the values of λ(u) only on this interval; the failure rate is therefore a local indicator of the reliability of the element on a given time interval.
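The locality property stated above follows from the standard relation between the failure rate and the probability of failure-free operation. The notation P(t) for that probability and Δt for the length of the interval is an assumption made here for illustration:

```latex
P(t,\, t+\Delta t) \;=\; \frac{P(t+\Delta t)}{P(t)}
\;=\; \exp\!\left(-\int_{t}^{t+\Delta t} \lambda(u)\,du\right)
```

The conditional probability of surviving the interval contains only the values of λ(u) inside it, and for a constant failure rate λ it reduces to exp(-λΔt).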

Topic 1.3. Reliability of recoverable systems

Modern automation systems are complex recoverable systems. When some of their elements fail in operation, such systems are repaired and continue to work. The ability of systems to be restored in operation is "laid down" during their design and ensured during manufacture, and repair and restoration operations are provided for in the regulatory and technical documentation.

Carrying out repair and restoration measures is essentially another way to improve the reliability of the system.

1.3.1. Reliability indicators of recoverable systems

On the quantitative side, such systems, in addition to the previously considered reliability indicators, are also characterized by complex reliability indicators.

A complex indicator of reliability is a reliability indicator that characterizes several properties that make up the reliability of an object.

Complex indicators of reliability, which are most widely used in characterizing the reliability of restored systems, are:

Availability factor;

Operational readiness factor;

Technical utilization factor.

Availability factor: the probability that the object will be in a working state at an arbitrary point in time, except for planned breaks during which the use of the object for its intended purpose is not provided.

Thus, the availability factor simultaneously characterizes two different properties of an object: reliability and maintainability.

The availability factor is an important parameter, but it is not universal.

Operational readiness factor: the probability that the object will be in a working state at an arbitrary point in time, except for planned breaks during which the use of the object for its intended purpose is not provided, and that, starting from this moment, it will work without failure for a given time interval.

This factor characterizes the reliability of objects that may be needed at an arbitrary point in time, after which a certain period of failure-free operation is required. Up to this point, the equipment may be in standby mode or in use for other operating functions.

Technical utilization factor: the ratio of the mathematical expectation of the time the object spends in a working state over a certain period of operation to the sum of the mathematical expectations of the time the object spends in a working state, in downtime due to maintenance, and in repair over the same period of operation.
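The definition of the technical utilization factor translates directly into a ratio of expected times. The sketch below is a minimal illustration; the function name and the numbers are hypothetical and are not taken from the text.

```python
def technical_utilization_factor(t_work, t_maintenance, t_repair):
    """Expected working time divided by working + maintenance + repair time."""
    total = t_work + t_maintenance + t_repair
    if total <= 0:
        raise ValueError("total time must be positive")
    return t_work / total


# Hypothetical expected times over one period of operation, in hours.
k_tu = technical_utilization_factor(t_work=900.0, t_maintenance=60.0, t_repair=40.0)
print(f"technical utilization factor = {k_tu:.2f}")
```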

When considering the laws of failure distribution, it was found that the failure rates of elements can be either constant or vary depending on the operating time. For long-term systems, which include all transportation systems, preventive maintenance is provided, which practically eliminates the effect of wear-out failures, so only sudden failures occur.

This greatly simplifies the reliability calculation. However, complex systems are made up of many elements connected in different ways. When the system is in operation, some of its elements operate continuously, others operate only at certain intervals, and still others perform only short switching or connection operations. Consequently, during a given period of time, only some of the elements have an operating time that coincides with the operating time of the system, while the others operate for a shorter time.

In this case, to calculate the operating time of a given system, only the time during which the element is turned on is considered; such an approach is possible if we assume that during periods when the elements are not included in the operation of the system, their failure rate is zero.

From the point of view of reliability, the most common scheme is the series connection of elements. In this case, the calculation uses the rule of the product of reliabilities:

$$R_s(t) = \prod_{i=1}^{n} R(t_i),$$

where $R(t_i)$ is the reliability of the i-th element, which is switched on for $t_i$ hours out of the total system operating time of $t$ hours.

For calculations, the so-called employment rate can be used, equal to

$$k_i = \frac{t_i}{t},$$

i.e., the ratio of the element's operating time to the system's operating time. The practical meaning of this coefficient is that for an element with a known failure rate $\lambda_i$, the failure rate in the system, taking into account the operating time, will be equal to

$$\lambda_{is} = k_i \lambda_i.$$

The same approach can be used in relation to the individual nodes of the system.
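The effect of the employment rate on a series system can be illustrated with a short sketch. It assumes the exponential law (constant failure rates) and zero failure rate while an element is switched off, as stated in the text; the function names and the numbers are hypothetical.

```python
import math


def effective_failure_rate(failure_rate, t_element, t_system):
    """Failure rate of an element scaled by its employment rate t_i / t."""
    return failure_rate * (t_element / t_system)


def series_reliability(elements, t_system):
    """Reliability of a series system; elements = [(failure_rate, t_i), ...]."""
    total_rate = sum(effective_failure_rate(lam, t_i, t_system)
                     for lam, t_i in elements)
    return math.exp(-total_rate * t_system)


# Hypothetical data: one element works all 1000 h, the other only 200 h.
elements = [(1e-4, 1000.0), (5e-4, 200.0)]
r_system = series_reliability(elements, t_system=1000.0)
print(f"R = {r_system:.4f}")
```

Summing the scaled failure rates gives the same result as multiplying the individual reliabilities exp(-λ_i t_i), which is the product rule stated above.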

Another factor to consider when analyzing the reliability of a system is the level of workload with which the elements work in the system, as it largely determines the magnitude of the expected failure rate.

The failure rate of elements varies significantly even with small changes in the workload that affects them.

In this case, the main difficulty in the calculation is caused by a variety of factors that determine both the concept of element strength and the concept of load.

The strength of an element combines its resistance to mechanical loads, vibrations, pressure, acceleration, etc. The category of strength also includes resistance to thermal loads, electrical strength, moisture resistance, corrosion resistance and a number of other properties. Therefore, strength cannot be expressed by a certain numerical value, and there are no strength units that take into account all these factors. Loads are also varied. Therefore, to assess the strength and load, statistical methods are used, which determine the observed effect of element failure over time under the action of a series of loads or under the action of a predominant load.

The elements are designed so that they can withstand the rated loads. During the operation of elements under nominal loads, a certain regularity is observed in the intensity of their sudden failures. This rate is called the nominal rate of sudden failures of elements, and it is the initial value for determining the actual rate of sudden failures of a real element (taking into account the operating time and workload).

For a real element or system, three main environmental influences are currently considered: mechanical, thermal, and work loads.

The influence of mechanical effects is taken into account by a coefficient whose value is determined by the installation location of the equipment and can be taken equal to:

for laboratories and comfortable premises - 1;

for stationary ground installations - 10;

for railway rolling stock - 30.

The nominal sudden failure rate, selected from Table 3, should be multiplied by this coefficient, depending on the installation location of the equipment in operation.

The curves in Fig. 7 illustrate the general nature of the change in the rate of sudden failures of electrical and electronic components depending on the heating temperature and the magnitude of the workload.

The intensity of sudden failures with an increase in the workload, as can be seen from the curves, increases according to the logarithmic law. These curves also show how the sudden failure rate of elements can be reduced even to a value below the nominal value. A significant reduction in the rate of sudden failures is achieved if the elements operate at loads below the nominal values.

Fig. 16

Fig. 7 can be used for approximate (educational) calculations of the reliability of any electrical and electronic components. The nominal mode in this case corresponds to a temperature of 80°C and 100% of the working load.

If the design parameters of the element differ from the nominal values, then according to the curves in Fig. 7, the increase for the selected parameters can be determined and the ratio by which the value of the failure rate of the element under consideration is multiplied is obtained.

High reliability can be built into the design of elements and systems. To do this, it is necessary to strive to reduce the temperature of the elements during operation and use elements with increased nominal parameters, which is tantamount to reducing workloads.

In any case, the increase in the cost of manufacturing the product pays off by reducing operating costs.

The failure rate of electrical circuit elements as a function of the load can be determined as described above or by empirical formulas, in particular as a function of the operating voltage and temperature.

Here the table value of the failure rate is taken at the rated voltage and temperature t 1, and the resulting failure rate corresponds to the operating voltage U 2 and temperature t 2.

It is assumed that the mechanical effects remain at the same level. Depending on the kind and type of the elements, the value of P varies from 4 to 10, and the value of K lies within 1.02 to 1.15.
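The formula itself has not survived in the text. One commonly used empirical form that is consistent with the parameter ranges given above is a power law in the voltage ratio combined with an exponential temperature term. The sketch below uses that form as an assumption, not as the recovered original; the function name and the numbers are hypothetical.

```python
def load_adjusted_failure_rate(lam_rated, u_rated, t_rated, u_work, t_work,
                               p=4.0, k=1.02):
    """Assumed form: lam_work = lam_rated * (U2 / U1) ** p * k ** (t2 - t1)."""
    return lam_rated * (u_work / u_rated) ** p * k ** (t_work - t_rated)


# Hypothetical element: table value 1e-6 1/hour at 100 V and 80 degC.
lam_rated = 1.0e-6
lam_derated = load_adjusted_failure_rate(lam_rated, 100.0, 80.0, 70.0, 60.0)
lam_same = load_adjusted_failure_rate(lam_rated, 100.0, 80.0, 100.0, 80.0)
print(f"derated: {lam_derated:.3e} 1/hour, rated: {lam_same:.3e} 1/hour")
```

With this form, operating below the rated voltage and temperature lowers the failure rate, which matches the statement that derating reduces the rate of sudden failures.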

When determining the actual failure rate of elements, it is necessary to have a good idea of the expected load levels at which the elements will operate, to calculate the values of electrical and thermal parameters, taking into account transient conditions. The correct identification of loads acting on individual elements leads to a significant increase in the accuracy of reliability calculations.

When calculating reliability with wear failures taken into account, the operating conditions must also be considered. The durability values M given in Table 3 refer, like the failure rates, to nominal load and laboratory conditions. All elements operating in other conditions have a durability that differs from it by a factor K. The value of K can be taken equal to:

for the laboratory - 1.0;

for ground installations - 0.3;

for railway rolling stock - 0.17.

Small variations of the coefficient K are possible for equipment of various purposes.

To determine the expected durability M, it is necessary to multiply the average (nominal) durability determined from the table by the coefficient K.

In the absence of materials necessary to determine the failure rate depending on the load levels, a coefficient method for calculating the failure rate can be used.

The essence of the coefficient calculation method is that, when calculating the equipment reliability criteria, coefficients are used that relate the failure rates of elements of various types to the failure rate of an element whose reliability characteristics are reliably known.

It is assumed that the exponential law of reliability is valid and that the failure rates of elements of all types vary with the operating conditions to the same extent. The last assumption means that under various operating conditions the relation

$$K_i = \frac{\lambda_i}{\lambda_0} = \text{const}$$

holds, where $\lambda_0$ is the failure rate of an element whose quantitative characteristics are known;

$K_i$ is the reliability factor of the i-th element. An element with the failure rate $\lambda_0$ is called the main element of the system calculation. When calculating the coefficients K i, an unregulated wire-wound resistor is taken as the main element of the system calculation. In this case, to calculate the reliability of the system, it is not necessary to know the failure rates of elements of all types. It is enough to know only the reliability coefficients K i, the number of elements in the circuit and the failure rate of the main element of the calculation. Since K i has a spread of values, the reliability is checked both for K min and for K max. The values of K i, determined from the analysis of failure rate data for equipment of various purposes, are given in Table 5.

Table 5

The failure rate of the main element of the calculation (in this case, the resistor) should be determined as the weighted average of the failure rates of the resistors used in the system being designed, i.e.

$$\lambda_0 = \frac{\sum_{i=1}^{m} \lambda_{Ri} N_{Ri}}{\sum_{i=1}^{m} N_{Ri}},$$

where $\lambda_{Ri}$ and $N_{Ri}$ are the failure rate and the number of resistors of the i-th type and rating;

$m$ is the number of types and ratings of resistors.

It is desirable to plot the resulting dependence of the system reliability on the operating time both for K min and for K max.
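The coefficient method can be sketched as follows. The sketch assumes a series structure and the exponential law, so that the system failure rate is the failure rate of the main element multiplied by the sum of N i * K i; all names and numbers are hypothetical, and the K ranges are placeholders rather than values from Table 5.

```python
import math


def main_element_rate(resistors):
    """Weighted average failure rate; resistors = [(rate, count), ...]."""
    total_count = sum(count for _, count in resistors)
    return sum(rate * count for rate, count in resistors) / total_count


def system_failure_rate(lam0, groups, bound):
    """groups = [(count, k_min, k_max), ...]; bound is 'min' or 'max'."""
    index = 1 if bound == "min" else 2
    return lam0 * sum(group[0] * group[index] for group in groups)


def reliability(rate, t_hours):
    """Exponential law of reliability."""
    return math.exp(-rate * t_hours)


# Hypothetical resistor population and element groups.
lam0 = main_element_rate([(1.0e-6, 30), (2.0e-6, 10)])
groups = [(40, 1.0, 1.0), (20, 0.5, 1.5), (5, 2.0, 6.0)]
rate_min = system_failure_rate(lam0, groups, "min")
rate_max = system_failure_rate(lam0, groups, "max")
for t in (100.0, 1000.0, 5000.0):
    print(t, round(reliability(rate_max, t), 4), round(reliability(rate_min, t), 4))
```

Evaluating the reliability for both bounds gives a band rather than a single curve, which reflects the spread of the coefficients K i.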

Having information about the reliability of individual elements included in the system, it is possible to give a general assessment of the reliability of the system and determine the blocks and nodes that require further improvement. To do this, the system under study is divided into nodes according to a constructive or semantic feature (a block diagram is drawn up). For each selected node, reliability is determined (nodes with less reliability require refinement and improvement in the first place).

When comparing the reliability of nodes, and even more so of different system variants, it should be remembered that the absolute value of reliability does not reflect the behavior of the system in operation and its efficiency. The same value of system reliability can be determined in one case by the main elements, whose repair and replacement requires considerable time and large material costs (for an electric locomotive, removal from train operation), and in another case by small elements, which are replaced by the maintenance personnel without taking the machine out of service. Therefore, for a comparative analysis of designed systems, it is recommended to compare the reliability of elements that are similar in their significance and in the consequences of their failures.

For approximate reliability calculations, the operating experience of similar systems can be used, which to some extent takes the operating conditions into account. The calculation in this case can be carried out in two ways: by the average level of reliability of equipment of the same type or by the conversion factor to real operating conditions.

The calculation based on the average level of reliability is based on the assumption that the designed equipment and the operated sample are equal. This can be allowed with the same elements, similar systems and the same ratio of elements in the system.

The essence of the method is that the number of elements and the time between failures of the sample equipment are related to the same quantities for the designed equipment. From this relation, it is easy to determine the time between failures for the designed equipment.

The advantage of the method is its simplicity. Disadvantages - the absence, as a rule, of a sample of the equipment in operation, suitable for comparison with the designed device.
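The relation used by this method is not preserved in the text. If the elements of both devices are assumed to be equally reliable, the time between failures is inversely proportional to the number of elements; the sketch below relies on that assumption, and its names and numbers are hypothetical.

```python
def designed_time_between_failures(n_sample, t_sample, n_designed):
    """Assumed relation: N_sample * T_sample = N_designed * T_designed."""
    if n_designed <= 0:
        raise ValueError("number of elements must be positive")
    return t_sample * n_sample / n_designed


# Hypothetical sample: 2000 elements, 500 h between failures;
# the designed equipment has 2500 elements.
t_designed = designed_time_between_failures(2000, 500.0, 2500)
print(f"time between failures of the designed equipment: {t_designed:.0f} h")
```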

The calculation by the second method is based on determining a conversion factor that takes into account the operating conditions of similar equipment. To determine it, a similar system operated under the specified conditions is selected. Other requirements may not be met. For the selected system in operation, the reliability indicators are calculated using the data in Table 3, and the same indicators are determined separately from operating data.

The conversion factor K e is defined as the ratio of the time between failures according to operation data to the calculated time between failures T oz.

For the designed equipment, the calculation of reliability indicators is carried out using the same tabular data as for the operating system. The results are then multiplied by K e.

The coefficient K e takes into account real operating conditions (preventive repairs and their quality, replacement of parts between repairs, qualifications of maintenance personnel, condition of depot equipment, etc.) that cannot be foreseen with other calculation methods. The value of K e may be greater than one.

Any of the considered methods of calculation can be performed for a given reliability, i.e., by the opposite method - from the reliability of the system and the time between failures to the choice of indicators of the constituent elements.

1.1 Probability of failure-free operation

The probability of failure-free operation is the probability that under certain operating conditions, within a given operating time, no failure will occur.
The probability of failure-free operation is denoted as P(l) and is determined by formula (1.1):

$$P(l) = \frac{N_0 - r(l)}{N_0}, \qquad (1.1)$$

where N 0 is the number of elements at the beginning of the test; r(l) is the number of element failures by the operating time l. It should be noted that the larger the value of N 0, the more accurately the probability P(l) can be calculated.
At the beginning of operation of a serviceable locomotive P(0) = 1, since during the run l= 0, the probability that no element fails takes the maximum value - 1. With increasing mileage l probability P(l) will decrease. In the process of approaching the service life to an infinitely large value, the probability of failure-free operation will tend to zero P(l→∞) = 0. Thus, in the process of operating time, the value of the probability of no-failure operation varies from 1 to 0. The nature of the change in the probability of no-failure operation as a function of mileage is shown in fig. 1.1.

Fig. 1.1. Graph of the change in the probability of failure-free operation P(l) depending on the operating time

This indicator has two main advantages in calculations. First, the probability of failure-free operation covers all factors that affect the reliability of the element and makes it easy to judge its reliability, because the larger the value of P(l), the higher the reliability. Second, the probability of failure-free operation can be used in reliability calculations of complex systems consisting of more than one element.

1.2 Probability of failure

The probability of failure is the probability that, under certain operating conditions, at least one failure will occur within a given operating time.
The failure probability is denoted as Q(l) and is determined by formula (1.2):

$$Q(l) = \frac{r(l)}{N_0} = 1 - P(l). \qquad (1.2)$$

At the beginning of operation of a serviceable locomotive Q(0) = 0, since at the run l = 0 the probability that at least one element will fail takes its minimum value, 0. With increasing mileage l, the failure probability Q(l) will increase. As the service life approaches an infinitely large value, the probability of failure will tend to unity: Q(l→∞) = 1. Thus, in the course of operation the probability of failure varies from 0 to 1. The nature of the change in the probability of failure as a function of the run is shown in Fig. 1.2. Failure-free operation and failure are opposite, mutually exclusive events.

Fig. 1.2. Graph of the change in the probability of failure Q(l) depending on the operating time

1.3 Failure frequency

The failure frequency is the number of elements that failed per unit of time or mileage, divided by the initial number of elements tested. In other words, the failure frequency is an indicator that characterizes the rate of change of the probability of failure and of the probability of failure-free operation as the duration of operation increases.
The failure frequency is determined by formula (1.3), which uses the number of failed elements in the run interval.

The value of this indicator makes it possible to judge the number of elements that will fail in a given period of time or mileage, and also to calculate the number of spare parts required.
The nature of the change in the failure frequency as a function of mileage is shown in Fig. 1.3.

Fig. 1.3. Graph of the change in the failure frequency depending on the operating time

1.4 Failure rate

The failure rate is the conditional probability density of the occurrence of a failure of an object, determined for the considered point in time or operating time, provided that no failure has occurred up to this point. In other words, the failure rate is the ratio of the number of elements that failed per unit of time or mileage to the number of properly working elements in the given period of time.
The failure rate is determined by formula (1.4).

Typically, the failure rate is a non-decreasing function of time. The failure rate is usually used to assess the propensity for failures at various points in the operation of objects.
Fig. 1.4 shows the theoretical nature of the change in the failure rate as a function of the run.

Fig. 1.4. Graph of the change in the failure rate depending on the operating time

Three main stages, reflecting the process of operation of an element or of an object as a whole, can be distinguished on the graph of the change in the failure rate shown in Fig. 1.4.
The first stage, also called the burn-in stage, is characterized by an elevated failure rate during the initial period of operation. The cause of the elevated failure rate at this stage is hidden manufacturing defects.
The second stage, the period of normal operation, is characterized by the failure rate tending to a constant value. During this period, random failures may occur due to a sudden load concentration that exceeds the ultimate strength of the element.
The third stage is the so-called period of accelerated aging. It is characterized by the occurrence of wear failures. Further operation of the element without replacement becomes economically unjustified.

1.5 Mean time to failure

Mean time to failure is the average mileage of an element before failure.
The mean time to failure is denoted as L 1 and is determined by formula (1.5):

where l i is the time to failure of the element; r i is the number of failures.
The mean time to failure can be used to preliminarily determine the timing of the repair or replacement of the element.

1.6 Mean value of the failure rate parameter

The average value of the failure flow parameter characterizes the average density of the probability of an object failure occurring, determined for the considered moment of time.
The average value of the failure flow parameter is denoted as W avg and is determined by formula (1.6):

1.7 Example of calculation of reliability indicators

Initial data.
During the run from 0 to 600 thousand km, information on failures of traction motors (TED) was collected in the locomotive depot. The number of serviceable TEDs at the beginning of the operation period was N 0 = 180. The total number of failed TEDs over the analyzed period was ∑r(600000) = 60. The run interval is taken equal to 100 thousand km. The number of failed TEDs in each interval was: 2, 12, 16, 10, 14, 6.

Required.
It is necessary to calculate the reliability indicators and build their dependencies of change over time.

First you need to fill in the table of initial data as shown in Table. 1.1.

Table 1.1.

Initial data for calculation

l, thousand km	0 - 100	100 - 200	200 - 300	300 - 400	400 - 500	500 - 600
Failures in the interval	2	12	16	10	14	6
Cumulative failures ∑r(l)	2	14	30	40	54	60

First, using equation (1.1), we determine the probability of failure-free operation for each section of the run. Thus, for the sections from 0 to 100 and from 100 to 200 thousand km of run, the probability of failure-free operation will be:

$$P(100\,000) = \frac{180 - 2}{180} = 0.989; \qquad P(200\,000) = \frac{180 - 14}{180} = 0.922.$$

Let us calculate the failure frequency according to equation (1.3).

Then the failure rate in the section 0-100 thousand km. will be equal to:

Similarly, we determine the value of the failure rate for the interval of 100-200 thousand km.

Using equations (1.5 and 1.6), we determine the average time to failure and the average value of the failure rate parameter.

We systematize the results of the calculation and present them in the form of a table (Table 1.2.).

Table 1.2.

The results of the calculation of reliability indicators

l, thousand km	0 - 100	100 - 200	200 - 300	300 - 400	400 - 500	500 - 600
Failures in the interval	2	12	16	10	14	6
Cumulative failures ∑r(l)	2	14	30	40	54	60
P(l)	0.989	0.922	0.833	0.778	0.7	0.667
Q(l)	0.011	0.078	0.167	0.222	0.3	0.333
Failure frequency, 10^-7 1/km	1.111	6.667	8.889	5.556	7.778	3.333
Failure rate, 10^-7 1/km	1.117	6.977	10.127	6.897	10.526	4.878
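The rows of Table 1.2 can be reproduced with a short script. The formulas in it are inferred from the table values rather than taken from the lost equations (1.3) and (1.4): the failure frequency divides the failures in an interval by the initial number of elements and the interval length, while the failure rate divides them by the average number of working elements in the interval. The variable names are illustrative.

```python
N0 = 180                 # serviceable TEDs at the start
INTERVAL_KM = 100_000    # length of one run interval
failures = [2, 12, 16, 10, 14, 6]

rows = []
cumulative = 0
for dr in failures:
    working_start = N0 - cumulative
    cumulative += dr
    working_end = N0 - cumulative
    p = working_end / N0                       # probability of failure-free operation
    q = cumulative / N0                        # probability of failure
    frequency = dr / (N0 * INTERVAL_KM)        # failure frequency, 1/km
    n_avg = (working_start + working_end) / 2  # average number of working elements
    rate = dr / (n_avg * INTERVAL_KM)          # failure rate, 1/km
    rows.append((p, q, frequency, rate))

for p, q, frequency, rate in rows:
    print(f"{p:.3f}  {q:.3f}  {frequency * 1e7:.3f}  {rate * 1e7:.3f}")
```

Because the failure rate is referred to the shrinking population of working elements, it is always at least as large as the failure frequency in the same interval.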

Let us present the nature of the change in the probability of failure-free operation of the TED depending on the run (Fig. 1.5). It should be noted that at the first point on the graph, i.e. at a run equal to 0, the probability of failure-free operation takes its maximum value, 1.

Fig. 1.5. Graph of the change in the probability of failure-free operation depending on the operating time

Let us present the nature of the change in the probability of TED failure depending on the run (Fig. 1.6). It should be noted that at the first point on the graph, i.e. at a run equal to 0, the probability of failure takes its minimum value, 0.

Fig. 1.6. Graph of the change in the probability of failure depending on the operating time

We present the nature of the change in the failure frequency of the TED depending on the run (Fig. 1.7).

Fig. 1.7. Graph of the change in the failure frequency depending on the operating time

Fig. 1.8 shows the dependence of the failure rate on the operating time.

Fig. 1.8. Graph of the change in the failure rate depending on the operating time

2.1 Exponential law of distribution of random variables

The exponential law describes the reliability of units quite accurately in the case of sudden failures of a random nature. Attempts to apply it to other types of failures, especially gradual failures caused by wear and changes in the physicochemical properties of elements, have shown that it fits them poorly.

Initial data.
Testing of ten high-pressure fuel pumps gave the following operating times to failure: 400, 440, 500, 600, 670, 700, 800, 1200, 1600, 1800 hours. Assume that the operating time to failure of the fuel pumps obeys the exponential distribution law.

Required.
Estimate the failure rate, and calculate the probability of failure-free operation for the first 500 hours and the probability of failure in the interval between 800 and 900 hours of diesel operation.

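First, we determine the mean time to failure of the fuel pumps as the arithmetic mean of the observed operating times:

$$T = \frac{1}{N}\sum_{i=1}^{N} t_i = \frac{8710}{10} = 871\ \text{h}$$

Then we calculate the failure rate:

$$\lambda = \frac{1}{T} = \frac{1}{871} \approx 1.148 \cdot 10^{-3}\ \text{h}^{-1}$$

The probability of failure-free operation of the fuel pumps over an operating time of 500 hours is:

$$P(500) = e^{-\lambda t} = e^{-500/871} \approx 0.563$$

The probability of failure between 800 and 900 hours of pump operation is:

$$Q(800, 900) = P(800) - P(900) = e^{-800/871} - e^{-900/871} \approx 0.399 - 0.356 = 0.043$$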
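The calculation in this example can be checked numerically. The sketch below assumes the usual exponential-law relations (mean time to failure as the sample mean, λ = 1/T, P(t) = exp(-λt)). It reads the failure probability on the interval as the unconditional difference P(800) - P(900), which is an interpretation of the task, not something the text states. Names are illustrative.

```python
import math

# Operating times to failure of the ten fuel pumps, hours (initial data).
times_to_failure_h = [400, 440, 500, 600, 670, 700, 800, 1200, 1600, 1800]

# Mean time to failure: arithmetic mean of the observed times.
mean_time_h = sum(times_to_failure_h) / len(times_to_failure_h)

# For the exponential law the failure rate is the reciprocal of the mean.
failure_rate = 1.0 / mean_time_h


def survival(t_hours, lam):
    """Probability of failure-free operation P(t) = exp(-lam * t)."""
    return math.exp(-lam * t_hours)


p_500 = survival(500, failure_rate)

# ASSUMPTION: unconditional probability that the failure falls in (800, 900] h.
q_800_900 = survival(800, failure_rate) - survival(900, failure_rate)

print(f"T = {mean_time_h:.0f} h, lambda = {failure_rate:.6f} 1/h")
print(f"P(500) = {p_500:.3f}, Q(800..900) = {q_800_900:.3f}")
```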

2.2 Weibull-Gnedenko distribution law

The Weibull-Gnedenko distribution law is widely used for systems that consist of chains of elements connected in series from the point of view of system reliability. Examples are the systems servicing a diesel generator set: lubrication, cooling, fuel supply, air supply, etc.

Initial data.
The idle time of diesel locomotives in unscheduled repairs due to the fault of auxiliary equipment obeys the Weibull-Gnedenko distribution law with parameters b=2 and a=46.

Required.
Determine the probability that a diesel locomotive leaves unscheduled repair within 24 hours of downtime, and the downtime within which serviceability is restored with a probability of 0.95.

Let us find the probability that the locomotive's serviceability is restored after 24 hours of downtime in the depot, according to the equation:

To determine the downtime within which the locomotive is restored with the given probability, we use the same expression:
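The two quantities requested in this example can be sketched numerically. The code assumes the common two-parameter form F(t) = 1 - exp(-(t/a)^b), with a = 46 h as the scale and b = 2 as the shape. The text does not spell out the parameterization, so this is an assumption, and the function names are illustrative.

```python
import math

# ASSUMPTION: a is the scale parameter (hours) and b is the shape parameter.
A_SCALE_H = 46.0
B_SHAPE = 2.0


def weibull_cdf(t, a, b):
    """F(t) = 1 - exp(-(t / a) ** b): probability that repair is finished by time t."""
    return 1.0 - math.exp(-((t / a) ** b))


def weibull_quantile(prob, a, b):
    """Inverse of weibull_cdf: time by which the event has occurred with probability prob."""
    return a * (-math.log(1.0 - prob)) ** (1.0 / b)


p_24h = weibull_cdf(24.0, A_SCALE_H, B_SHAPE)
t_95 = weibull_quantile(0.95, A_SCALE_H, B_SHAPE)

print(f"F(24 h) = {p_24h:.3f}")
print(f"t(0.95) = {t_95:.1f} h")
```

Under this assumed form, roughly a quarter of the locomotives leave repair within the first day, and about 80 hours are needed to reach a 0.95 probability of restoration.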

2.3 Rayleigh's distribution law

The Rayleigh distribution law is mainly used to analyze the operation of elements that have a pronounced effect of aging (electrical equipment elements, various kinds of seals, washers, gaskets made of rubber or synthetic materials).

Initial data.
It is known that the operating time of contactors to failure in terms of coil insulation aging parameters can be described by the Rayleigh distribution function with the parameter S = 260 thousand km.

Required.
For an operating time of 120 thousand km, determine the probability of failure-free operation, the failure rate, and the mean time to the first failure of the electromagnetic contactor coil.
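The solution to this task is not reproduced in the text, so the following is only a sketch under the standard Rayleigh relations, which are assumed here: P(l) = exp(-l²/(2S²)), λ(l) = l/S², and mean time to failure S·sqrt(π/2). Function names are illustrative.

```python
import math

S_THOUSAND_KM = 260.0     # Rayleigh parameter S from the initial data
L_THOUSAND_KM = 120.0     # operating time of interest


def rayleigh_survival(l, s):
    """Probability of failure-free operation P(l) = exp(-l^2 / (2 s^2))."""
    return math.exp(-(l ** 2) / (2.0 * s ** 2))


def rayleigh_failure_rate(l, s):
    """Failure rate lambda(l) = l / s^2; grows linearly with operating time (aging)."""
    return l / s ** 2


def rayleigh_mean(s):
    """Mean time to first failure s * sqrt(pi / 2)."""
    return s * math.sqrt(math.pi / 2.0)


p_120 = rayleigh_survival(L_THOUSAND_KM, S_THOUSAND_KM)
lam_120 = rayleigh_failure_rate(L_THOUSAND_KM, S_THOUSAND_KM)   # per thousand km
mean_l = rayleigh_mean(S_THOUSAND_KM)                            # thousand km

print(f"P(120) = {p_120:.3f}")
print(f"lambda(120) = {lam_120:.6f} per thousand km")
print(f"mean time to failure = {mean_l:.1f} thousand km")
```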

3.1 Basic connection of elements

A system of several independent elements, functionally connected so that the failure of any one of them causes the system to fail, is represented by a reliability block diagram in which the failure-free operation events of the elements are connected in series.

Initial data.
The non-redundant system consists of 5 elements. Their failure rates are, respectively, 0.00007; 0.00005; 0.00004; 0.00006; 0.00004 1/h.

Required.
Determine the reliability indicators of the system: failure rate, mean time to failure, probability of failure-free operation, and failure frequency. Obtain the reliability indicators P(l) and a(l) in the range from 0 to 1000 hours with a step of 100 hours.

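We calculate the system failure rate and the mean time to failure using the following equations:

$$\lambda_s = \sum_{i=1}^{5} \lambda_i = 0.00007 + 0.00005 + 0.00004 + 0.00006 + 0.00004 = 0.00026\ \text{h}^{-1}$$

$$T = \frac{1}{\lambda_s} = \frac{1}{0.00026} \approx 3846\ \text{h}$$

The probability of failure-free operation and the failure frequency are obtained from the equations reduced to the form:

$$P(l) = e^{-\lambda_s l} = e^{-0.00026\, l}, \qquad a(l) = \lambda_s e^{-\lambda_s l} = 0.00026\, e^{-0.00026\, l}$$

The calculated values of P(l) and a(l) in the interval from 0 to 1000 hours of operation are presented in Table 3.1.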

Table 3.1.

The results of calculating the probability of failure-free operation and the frequency of failures of the system in the time interval from 0 to 1000 hours.

l, h	P(l)	a(l), 1/h
0	1	0.00026
100	0.974335	0.000253
200	0.949329	0.000247
300	0.924964	0.00024
400	0.901225	0.000234
500	0.878095	0.000228
600	0.855559	0.000222
700	0.833601	0.000217
800	0.812207	0.000211
900	0.791362	0.000206
1000	0.771052	0.0002
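The values in Table 3.1 can be regenerated with a few lines of code. This is a sketch that assumes exponentially distributed element lifetimes, so that the rates of series-connected elements add up, P(l) = exp(-λl) and a(l) = λ·exp(-λl). Names are illustrative.

```python
import math

# Failure rates of the five series-connected elements, 1/h (initial data).
element_rates = [0.00007, 0.00005, 0.00004, 0.00006, 0.00004]

# In a series (basic) connection the element failure rates add up.
system_rate = sum(element_rates)
mean_time_h = 1.0 / system_rate


def series_indicators(t_hours, lam):
    """Return (P, a): failure-free probability and failure frequency at time t."""
    p = math.exp(-lam * t_hours)
    return p, lam * p


rows = [(t, *series_indicators(t, system_rate)) for t in range(0, 1001, 100)]

print(f"lambda = {system_rate:.5f} 1/h, T = {mean_time_h:.0f} h")
for t, p, a in rows:
    print(f"{t:5d}  {p:.6f}  {a:.6f}")
```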

Plots of P(l) and a(l) over the interval up to the mean time to failure are shown in Fig. 3.1 and 3.2.

Fig. 3.1. Probability of failure-free operation of the system.

Fig. 3.2. System failure frequency.

3.2 Redundant connection of elements

Initial data.
Figures 3.3 and 3.4 show two block diagrams for connecting elements: with general redundancy (Fig. 3.3) and with element-by-element redundancy (Fig. 3.4). The probabilities of failure-free operation of the elements are P1(l) = P'1(l) = 0.95; P2(l) = P'2(l) = 0.9; P3(l) = P'3(l) = 0.85.

Fig. 3.3. Diagram of a system with general redundancy.

Fig. 3.4. Diagram of a system with element-by-element redundancy.

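The probability of failure-free operation of a block of three elements without redundancy is calculated by the expression:

$$P(l) = P_1(l)\, P_2(l)\, P_3(l) = 0.95 \cdot 0.9 \cdot 0.85 \approx 0.727$$

The probability of failure-free operation of the same system with general redundancy (Fig. 3.3) is:

$$P_{gen}(l) = 1 - (1 - P(l))^2 = 1 - (1 - 0.727)^2 \approx 0.925$$

The probabilities of failure-free operation of each of the three blocks with element-by-element redundancy (Fig. 3.4) are:

$$P_{b1} = 1 - (1 - 0.95)^2 = 0.9975, \quad P_{b2} = 1 - (1 - 0.9)^2 = 0.99, \quad P_{b3} = 1 - (1 - 0.85)^2 = 0.9775$$

The probability of failure-free operation of the system with element-by-element redundancy is:

$$P_{el}(l) = P_{b1}\, P_{b2}\, P_{b3} = 0.9975 \cdot 0.99 \cdot 0.9775 \approx 0.965$$

Thus, element-by-element redundancy gives a larger gain in reliability: the probability of failure-free operation rises from 0.925 with general redundancy to 0.965, i.e. by 4 percentage points.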
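The comparison of the two redundancy schemes can be sketched with two helper functions. The code assumes independent elements and a single standby copy (the primed elements), which is how Fig. 3.3 and 3.4 are read here. The helper names `series` and `parallel` are illustrative.

```python
from math import prod

# Probabilities of failure-free operation of the three elements (initial data).
element_reliability = [0.95, 0.90, 0.85]


def series(probabilities):
    """All parts must work: product of the individual probabilities."""
    return prod(probabilities)


def parallel(probabilities):
    """At least one part must work: one minus the product of failure probabilities."""
    return 1.0 - prod(1.0 - p for p in probabilities)


# No redundancy: a plain series chain of three elements.
p_chain = series(element_reliability)

# General redundancy: the whole chain is duplicated.
p_general = parallel([p_chain, p_chain])

# Element-by-element redundancy: each element is duplicated, blocks in series.
p_elementwise = series(parallel([p, p]) for p in element_reliability)

print(f"no redundancy:     {p_chain:.3f}")
print(f"general:           {p_general:.3f}")
print(f"element-by-element:{p_elementwise:.3f}")
```

With the same number of spare elements, duplicating each element separately tolerates more failure combinations than duplicating the chain as a whole, which is why its result is higher.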

Initial data.
Figure 3.5 shows a system with a combined connection of elements. The probabilities of failure-free operation of the elements have the following values: P1 = 0.8; P2 = 0.9; P3 = 0.95; P4 = 0.97.

Required.
It is necessary to determine the reliability of the system. It is also necessary to determine the reliability of the same system, provided that there are no redundant elements.

Fig. 3.5. Diagram of a system with a combined connection of elements.

For the calculation, the main blocks must first be identified in the original system. There are three of them in the system shown (Fig. 3.6). We then calculate the reliability of each block separately and, from these, the reliability of the entire system.

Fig. 3.6. Diagram divided into blocks.

The reliability of the system without redundancy will be:

Thus, a non-redundant system is 28% less reliable than a redundant system.

At the stage of preliminary and approximate design calculations of electrical devices, the main reliability indicators are calculated.

The main qualitative indicators of reliability are:

Failure rate

Average time to failure.

Failure rate λ(t) is the number of failures n(t) of device elements per unit of time, referred to the average total number of elements N(t) operable during the interval Δt [9]:

$$\lambda(t) = \frac{n(t)}{N(t)\,\Delta t},$$

where Δt is a given period of time.

For example, 1000 elements of the device worked for 500 hours, and 2 elements failed during this time. Hence

$$\lambda(t) = \frac{n(t)}{N(t)\,\Delta t} = \frac{2}{1000 \cdot 500} = 4 \cdot 10^{-6}\ \text{h}^{-1},$$

that is, on average 4 elements out of a million fail per hour.

The failure rates λ(t) of elements are reference data; Appendix D gives the failure rates λ(t) for elements commonly used in circuits.

An electrical device consists of a large number of components, so the operational failure rate λ(t) of the entire device is determined as the sum of the failure rates of all its elements, according to the formula [11]

$$\lambda(t) = k \sum_{i=1}^{m} n_i \lambda_i(t),$$

where k is a correction factor that takes into account the relative change in the average failure rate of elements depending on the purpose of the device;

m is the total number of groups of elements;

n_i is the number of elements in the i-th group with the same failure rate λ_i(t).

The probability of failure-free operation P(t) is the probability that the device will not fail within a specified period of time t. It is estimated as the ratio of the number of devices that have worked without failure up to time t to the total number of devices operational at the initial moment.

For example, P(t) = 0.9 for t = 500 h means that, within 500 hours, one device out of ten (10 - 9 = 1) is expected to fail and 9 out of 10 devices will operate without failure.

P(t) = 0.8 for t = 1000 h means that, within 1000 hours, 20 out of 100 devices are expected to fail and 80 out of 100 devices will operate without failure.

P(t) = 0.975 for t = 2500 h means that, within 2500 hours, 1000 - 975 = 25 devices out of a thousand are expected to fail and 975 devices will operate without failure.

Quantitatively, the reliability of a device is estimated as the probability P(t) that the device performs its functions without failure during the time from 0 to t. The probability of failure-free operation P(t) (its calculated value should not be less than 0.85) is determined by the expression

$$P(t) = e^{-\lambda t} = e^{-t/T_0},$$

where t is the operating time of the system, h (t is selected from the range 1000, 2000, 4000, 8000, 10000 h);

λ is the device failure rate, 1/h;

T_0 is the time to failure, h.

The reliability calculation consists in finding the total failure rate λ of the device and the time between failures:

$$T_0 = \frac{1}{\lambda}$$

The recovery time of a device in case of failure includes the time for finding a failed element, the time for its replacement or repair, and the time for checking the device's operability.

The average recovery time T_v of electrical devices can be selected from the range 1, 2, 4, 6, 8, 10, 12, 18, 24, 36, 48 hours. Smaller values correspond to devices with high maintainability. The average recovery time T_v can be reduced by built-in monitoring or self-diagnosis, modular design of components, and easily accessible mounting.

The value of the availability factor is determined by the formula

$$K_g = \frac{T_0}{T_0 + T_v},$$

where T_0 is the time to failure, h;

T_v is the average recovery time, h.
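The effect of the recovery time on the availability factor can be sketched as follows. The code assumes the standard definition: availability equals T_0 divided by the sum of T_0 and the average recovery time. The value T_0 = 10000 h is purely hypothetical and serves only as an illustration. The recovery times are the range quoted above.

```python
def availability(t0_hours, recovery_hours):
    """Availability factor: share of time the device is operable.

    ASSUMPTION: standard definition T0 / (T0 + Tv).
    """
    if t0_hours <= 0 or recovery_hours < 0:
        raise ValueError("T0 must be positive and recovery time non-negative")
    return t0_hours / (t0_hours + recovery_hours)


# Range of average recovery times quoted in the text, hours.
recovery_options_h = [1, 2, 4, 6, 8, 10, 12, 18, 24, 36, 48]

# HYPOTHETICAL time to failure, used for illustration only.
T0_EXAMPLE_H = 10_000.0

factors = {tv: availability(T0_EXAMPLE_H, tv) for tv in recovery_options_h}
for tv, k in factors.items():
    print(f"Tv = {tv:2d} h  ->  availability = {k:.5f}")
```

For a fixed time to failure, the availability factor falls as the recovery time grows, which is why measures that improve maintainability raise it.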

The reliability of elements depends to a large extent on their electrical and thermal operating conditions. To increase reliability, the elements must be used in derated (lightly loaded) modes, which are characterized by load factors.

The load factor is the ratio of the design value of an element's parameter in the operating mode to its maximum allowable value. The load factors of different elements can vary greatly.

When calculating the reliability of the device, all elements of the system are divided into groups of elements of the same type with the same load factor K_n.

The failure rate of the i-th element is determined by the formula

$$\lambda_i = \lambda_{0i} K_{ni}, \qquad (10.3)$$

where K_ni is the load factor, either calculated in the operating-mode charts or set on the assumption that the element operates in normal modes (Appendix D gives the load factors of the elements);

λ_0i is the basic failure rate of the i-th element, given in Appendix D.

Often, failure rate data λ_0i for analogues of the elements are used in the reliability calculation.

Example: reliability calculation for a device consisting of a purchased imported BT-85W complex and a power supply under development, built from series-production components.

The failure rate of imported products is determined as the reciprocal of the operating time (sometimes the warranty service period of the product is taken), based on a certain number of operating hours per day.

The warranty period of the purchased imported product is 5 years, and the product will work about 14.24 hours a day:

T = 14.24 h × 365 days × 5 years ≈ 25981 h is the time between failures;

λ = 1/T = 1/25981 ≈ 38.4897·10^-6 1/h is the failure rate.

The calculations were performed in Excel; the initial data and results are given in Tables 10.1 and 10.2. An example of the calculation is given in Table 10.1.

Table 10.1 - Calculation of system reliability

Name and type of element or analogue	Load factor K_ni	λ_i, 10^-6 1/h	λ_i·K_ni, 10^-6 1/h	Number n_i	n_i·λ_i, 10^-6 1/h
Complex BT-85W	1.00	38.4897	38.4897		38.4897
Capacitor K53	0.60	0.0200	0.0120		0.0960
Socket (plug) SNP268	0.60	0.0500	0.0300		0.0900
TRS chip	0.50	0.0460	0.0230		0.0230
OMLT resistor	0.60	0.0200	0.0120		0.0120
Fusible insert VP1-1	0.30	0.1040	0.0312		0.0312
Zener diode 12V	0.50	0.4050	0.2500		0.4050
Indicator 3L341G	0.20	0.3375	0.0675		0.0675
Push-button switch	0.30	0.0100	0.0030		0.0030
Photodiode	0.50	0.0172	0.0086		0.0086
Welded connection	0.40	0.0001	0.0004		0.0004
Wire, m	0.20	0.0100	0.0020	0.2	0.0004
Solder connection	0.50	0.0030	0.0015		0.0045
λ of the whole device					Σ = 39.2313

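Determine the overall failure rate of the device from the total of the last column of Table 10.1:

$$\lambda = \sum_{i=1}^{m} n_i \lambda_i = 39.2313 \cdot 10^{-6}\ \text{h}^{-1}$$

Then the time between failures according to expression (10.2) is equal to

$$T_0 = \frac{1}{\lambda} = \frac{1}{39.2313 \cdot 10^{-6}} \approx 25490\ \text{h}$$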
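The totals for the device can be checked with a short script. This is a sketch that takes the group contributions from the last column of Table 10.1 as given, and assumes a correction factor k = 1 and an exponential law P(t) = exp(-λt). Variable names are illustrative.

```python
import math

# Last column of Table 10.1: contribution of each group, in units of 1e-6 1/h.
group_contributions_e6 = [
    ("Complex BT-85W", 38.4897),
    ("Capacitor K53", 0.0960),
    ("Socket (plug) SNP268", 0.0900),
    ("TRS chip", 0.0230),
    ("OMLT resistor", 0.0120),
    ("Fusible insert VP1-1", 0.0312),
    ("Zener diode 12V", 0.4050),
    ("Indicator 3L341G", 0.0675),
    ("Push-button switch", 0.0030),
    ("Photodiode", 0.0086),
    ("Welded connection", 0.0004),
    ("Wire", 0.0004),
    ("Solder connection", 0.0045),
]

K_CORRECTION = 1.0   # ASSUMPTION: correction factor k taken as 1

total_e6 = K_CORRECTION * sum(value for _, value in group_contributions_e6)
total_rate = total_e6 * 1e-6            # device failure rate, 1/h
time_between_failures_h = 1.0 / total_rate


def survival(t_hours):
    """Probability of failure-free operation of the device over t hours."""
    return math.exp(-total_rate * t_hours)


print(f"lambda = {total_e6:.4f}e-6 1/h")
print(f"T0 = {time_between_failures_h:.0f} h")
for t in (1000, 2000, 4000, 5000, 8000, 10000):
    print(f"P({t}) = {survival(t):.3f}")
```

The purchased complex accounts for almost the whole failure rate of the device, so the elements of the power supply have little influence on the result.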

To determine the probability of failure-free operation for a given operating time, we tabulate P(t) and plot it:

Table 10.2 - Calculation of the probability of failure-free operation

t, h
P(t)	0.97	0.9	0.8	0.55	0.74	0.65	0.52	0.4	0.34

A plot of the probability of failure-free operation against the operating time is shown in Figure 10.1.

Figure 10.1 - Probability of failure-free operation versus operating time

For a device, the required probability of failure-free operation is usually set between 0.82 and 0.95. From the graph in Figure 10.1, we can determine that the developed device maintains the given probability of failure-free operation P(t) = 0.82 up to an operating time of about 5000 hours.

The calculation is made for the case when the failure of any element leads to the failure of the entire system; such a connection of elements is called logically series, or basic. Reliability can be improved by redundancy.

For example, suppose the element technology ensures an average failure rate of elementary parts λ_i = 1·10^-5 1/h. When N = 1·10^4 elementary parts are used in a device, the total failure rate is λ_o = N·λ_i = 10^-1 1/h, and the average uptime of the device is T_o = 1/λ_o = 10 h. If the device is built from 4 identical devices connected in parallel, the average uptime increases by N/4 = 2500 times and becomes 25,000 hours, or 34 months, or about 3 years.

The formulas allow you to calculate the reliability of the device, if the initial data are known - the composition of the device, the mode and conditions of its operation, the failure rate of its elements.