Introduction to parallel processing
In computers, parallel processing is the processing of program instructions by dividing them among multiple processors with the objective of running a program in less time. In the earliest computers, only one program ran at a time. A computation-intensive program that took one hour to run and a tape copying program that took one hour to run would take a total of two hours to run. An early form of parallel processing allowed the interleaved execution of both programs together. The computer would start an I/O operation, and while it was waiting for the operation to complete, it would execute the processor-intensive program. The total execution time for the two jobs would be a little over one hour.
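The arithmetic behind the two-job example can be sketched with a small model. This is only an idealized illustration: the function names and the split of each job into CPU minutes and I/O minutes are assumptions, and the model supposes that device time can always be overlapped with CPU work.

```python
def serial_minutes(jobs):
    """Run jobs one after another: every CPU and I/O minute adds up."""
    return sum(cpu + io for cpu, io in jobs)


def interleaved_minutes(jobs, overhead=0):
    """Idealized overlap: the CPU and the I/O device work at the same time,
    so the busier of the two resources bounds the total run time."""
    cpu_total = sum(cpu for cpu, _ in jobs)
    io_total = sum(io for _, io in jobs)
    return max(cpu_total, io_total) + overhead


# (cpu_minutes, io_minutes) per job; the figures are assumed for illustration.
compute_job = (60, 0)     # processor-intensive, no I/O
tape_copy_job = (2, 58)   # mostly waiting on the tape device
jobs = [compute_job, tape_copy_job]

print(serial_minutes(jobs))       # 120
print(interleaved_minutes(jobs))  # 62
```

With these assumed figures the interleaved run takes 62 minutes, which matches the "little over one hour" in the text: the extra two minutes are the CPU time the copy job needs to start its I/O operations.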
The next improvement was multiprogramming. In a multiprogramming system, multiple programs submitted by users were each allowed to use the processor for a short time. To users it appeared that all of the programs were executing at the same time. Problems of resource contention first arose in these systems. Explicit requests for resources led to the problem of deadlock. Competition for resources on machines with no tie-breaking instructions led to the critical section problem.
Vector processing was another attempt to increase performance by doing more than one thing at a time. In this case, capabilities were added to machines to allow a single instruction to add (or subtract, or multiply, or otherwise manipulate) two arrays of numbers. This was valuable in certain engineering applications where data naturally occurred in the form of vectors or matrices. In applications with less regularly structured data, vector processing was less valuable.
The next step in parallel processing was the introduction of multiprocessing. In these systems, two or more processors shared the work to be done. The earliest versions had a master/slave configuration. One processor (the master) was programmed to be responsible for all of the work in the system; the other (the slave) performed only those tasks it was assigned by the master. This arrangement was necessary because it was not then understood how to program the machines so they could cooperate in managing the resources of the system.
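The master/slave arrangement can be sketched in a few lines. This is a minimal illustration, not a description of any particular machine: the function names are hypothetical, and Python threads stand in for the two processors.

```python
import queue
import threading


def worker(tasks, results):
    """Worker: performs only the tasks the master hands it."""
    while True:
        task = tasks.get()
        if task is None:  # sentinel from the master: no more work
            break
        name, numbers = task
        results.put((name, sum(numbers)))


def master(work_items):
    """Master: owns the work list, assigns tasks, and collects the results."""
    tasks, results = queue.Queue(), queue.Queue()
    helper = threading.Thread(target=worker, args=(tasks, results))
    helper.start()
    for item in work_items:
        tasks.put(item)
    tasks.put(None)  # tell the worker to stop
    helper.join()    # wait until every assigned task is done
    totals = {}
    while not results.empty():
        name, total = results.get()
        totals[name] = total
    return totals


totals = master([("a", [1, 2, 3]), ("b", [10, 20])])
print(totals)  # {'a': 6, 'b': 30}
```

All scheduling decisions stay with the master, so the worker never has to coordinate access to shared resources on its own. That is the property that made this configuration workable on early multiprocessors.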
UNI-PROCESSOR SYSTEMS
Of the CPU's sixteen 32-bit registers, one serves as the program counter (PC).
The CPU consists of an ALU with a floating-point accelerator and diagnostic memory.
Both main memory and local memory are connected directly to a common bus, the synchronous backplane interconnect (SBI).
Parallelism and pipelining within the CPU
The ALU contains parallel adders that use carry-lookahead and carry-save techniques.
High-speed multiplier recoding and convergence division techniques are used to achieve parallelism.
The phases of instruction execution are pipelined: instruction fetch, decode, operand prefetch, arithmetic/logic execution, and result store.
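The benefit of pipelining the instruction phases can be shown with a small timing model. This is a sketch under simplifying assumptions: every stage takes exactly one cycle and there are no stalls or hazards. The function names are hypothetical.

```python
# Stage names follow the phases listed in the text; one cycle per stage is assumed.
STAGES = ["fetch", "decode", "operand prefetch", "execute", "store"]


def sequential_cycles(n_instructions, n_stages=len(STAGES)):
    """Cycles needed when each instruction finishes before the next starts."""
    return n_instructions * n_stages


def pipelined_cycles(n_instructions, n_stages=len(STAGES)):
    """Cycles needed in an ideal pipeline with no stalls."""
    if n_instructions == 0:
        return 0
    return n_stages + n_instructions - 1


def schedule(n_instructions):
    """Map each cycle to the (instruction, stage) pairs active in it."""
    table = {}
    for instr in range(n_instructions):
        for offset, stage in enumerate(STAGES):
            table.setdefault(instr + offset, []).append((instr, stage))
    return table


for cycle, active in sorted(schedule(3).items()):
    print(cycle, active)
print(sequential_cycles(3), pipelined_cycles(3))  # 15 7
```

Once the pipeline is full, one instruction completes per cycle, so under these assumptions three instructions need 7 cycles instead of 15.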
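The carry-lookahead idea used in the parallel adders can be sketched in software. This is an illustrative model only: the 4-bit width and the function names are assumptions, and Python evaluates the terms one after another, whereas hardware evaluates them simultaneously. The point is that each carry is derived directly from the input bits rather than from the previous carry.

```python
def carry_into(i, g, p, c0):
    """Carry into bit i, computed only from generate/propagate bits and c0."""
    # The incoming carry reaches bit i if every lower bit propagates it.
    carry = c0 & int(all(p[:i]))
    # A carry generated at bit j reaches bit i if bits j+1 .. i-1 all propagate.
    for j in range(i):
        carry |= g[j] & int(all(p[j + 1:i]))
    return carry


def carry_lookahead_add(a, b, width=4, c0=0):
    """Add two width-bit integers; return (sum_bits, carry_out)."""
    g = [(a >> i) & (b >> i) & 1 for i in range(width)]    # generate: both bits set
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(width)]  # propagate: exactly one set
    carries = [carry_into(i, g, p, c0) for i in range(width + 1)]
    total = 0
    for i in range(width):
        total |= (p[i] ^ carries[i]) << i  # sum bit = propagate XOR carry-in
    return total, carries[width]


print(carry_lookahead_add(0b1011, 0b0110))  # (1, 1), i.e. 11 + 6 = 17
```

Because no carry waits for another, the delay of such an adder in hardware grows far more slowly with word length than that of a ripple-carry adder, at the cost of extra logic.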