Data processing

Data processing is, generally, "the collection and manipulation of items of data to produce meaningful information." In this sense it can be considered a subset of information processing, "the change of information in any manner detectable by an observer."

The term Data Processing (DP) has also been used to refer to a department within an organization responsible for the operation of data processing applications.


1. Data processing functions

Data processing may involve various processes, including:

  • Sorting – "arranging items in some sequence and/or in different sets."
  • Classification – separation of data into various categories.
  • Validation – ensuring that supplied data is correct and relevant.
  • Summarization – reducing detailed data to its main points.
  • Aggregation – combining multiple pieces of data.
  • Reporting – listing detail or summary data or computed information.
  • Analysis – the "collection, organization, analysis, interpretation and presentation of data."
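
The functions listed above can be illustrated with a minimal Python sketch. The sample records, field names, and helper functions here are hypothetical and chosen only for illustration; they are one possible reading of each function, not a standard implementation.

```python
from collections import defaultdict

# Hypothetical sample records; the field names are illustrative only.
records = [
    {"id": 1, "region": "north", "amount": 120.0},
    {"id": 2, "region": "south", "amount": 75.5},
    {"id": 3, "region": "north", "amount": -10.0},  # fails validation
    {"id": 4, "region": "east", "amount": 42.0},
    {"id": 5, "region": "south", "amount": 30.0},
]


def validate(rows):
    """Validation: keep only rows whose amount is a non-negative number."""
    return [
        r for r in rows
        if isinstance(r["amount"], (int, float)) and r["amount"] >= 0
    ]


def sort_rows(rows):
    """Sorting: arrange rows by descending amount."""
    return sorted(rows, key=lambda r: r["amount"], reverse=True)


def classify(rows):
    """Classification: separate rows into categories by region."""
    groups = defaultdict(list)
    for r in rows:
        groups[r["region"]].append(r)
    return dict(groups)


def aggregate(groups):
    """Aggregation: combine each category's rows into a single total."""
    return {
        region: sum(r["amount"] for r in rows)
        for region, rows in groups.items()
    }


def summarize(rows):
    """Summarization: reduce the detail rows to a few headline figures."""
    amounts = [r["amount"] for r in rows]
    return {"count": len(amounts), "total": sum(amounts), "largest": max(amounts)}


def report(totals):
    """Reporting: list the computed totals as text lines, one per category."""
    return [f"{region:<6} {total:>8.2f}" for region, total in sorted(totals.items())]


valid = validate(records)
ordered = sort_rows(valid)
totals = aggregate(classify(valid))
summary = summarize(valid)
lines = report(totals)
```

In this sketch each function takes the output of an earlier one, so the validated rows feed sorting, classification, and summarization, and the classified groups feed aggregation and reporting.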

2.1. History: Manual data processing

Although widespread use of the term data processing dates only from the 1950s, data processing functions have been performed manually for millennia. For example, bookkeeping involves functions such as posting transactions and producing reports like the balance sheet and the cash flow statement. Completely manual methods were augmented by the application of mechanical or electronic calculators. A person whose job was to perform calculations manually or using a calculator was called a "computer."

The 1890 United States Census schedule was the first to gather data by individual rather than household. A number of questions could be answered by making a check in the appropriate box on the form. From 1850 through 1880 the Census Bureau employed "a system of tallying, which, by reason of the increasing number of combinations of classifications required, became increasingly complex. Only a limited number of combinations could be recorded in one tally, so it was necessary to handle the schedules 5 or 6 times, for as many independent tallies." "It took over 7 years to publish the results of the 1880 census" using manual processing methods.


2.2. History: Automatic data processing

The term automatic data processing was applied to operations performed by means of unit record equipment, such as Herman Hollerith's application of punched card equipment for the 1890 United States Census. "Using Hollerith's punchcard equipment, the Census Office was able to complete tabulating most of the 1890 census data in 2 to 3 years, compared with 7 to 8 years for the 1880 census. It is estimated that using Hollerith's system saved some $5 million in processing costs" in 1890 dollars, even though there were twice as many questions as in 1880.


2.3. History: Electronic data processing

Computerized data processing, or electronic data processing, represents a later development, with a computer used instead of several independent pieces of equipment. The Census Bureau first made limited use of electronic computers for the 1950 United States Census, using a UNIVAC I system, delivered in 1952.


2.4. History: Other developments

The term data processing has mostly been subsumed by the more general term information technology (IT). The older term "data processing" is suggestive of older technologies. For example, in 1996 the Data Processing Management Association (DPMA) changed its name to the Association of Information Technology Professionals. Nevertheless, the terms are approximately synonymous.


3. Applications

Commercial data processing

Commercial data processing involves a large volume of input data, relatively few computational operations, and a large volume of output. For example, an insurance company needs to keep records on tens or hundreds of thousands of policies, print and mail bills, and receive and post payments.
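
The insurance example can be sketched as a small batch job in Python. This is only an illustration under stated assumptions: the `Policy` record, the policy identifiers, and the `bill_policies` and `post_payments` helpers are all hypothetical, and a real system would handle far larger volumes and many more cases. It shows the typical shape of the workload: many records, simple arithmetic per record, and one output line per record.

```python
from dataclasses import dataclass


@dataclass
class Policy:
    # Hypothetical record layout, for illustration only.
    policy_id: str
    holder: str
    premium_due: float
    balance: float = 0.0  # amount currently owed


def bill_policies(policies):
    """Add each policy's premium to its balance; return one bill line per policy."""
    bills = []
    for p in policies.values():
        p.balance += p.premium_due
        bills.append(f"{p.policy_id} {p.holder}: amount due {p.balance:.2f}")
    return bills


def post_payments(policies, payments):
    """Subtract each payment from the matching policy's balance.

    Returns the payments that matched no known policy.
    """
    unmatched = []
    for policy_id, amount in payments:
        policy = policies.get(policy_id)
        if policy is None:
            unmatched.append((policy_id, amount))
        else:
            policy.balance -= amount
    return unmatched


policies = {
    "P-001": Policy("P-001", "A. Example", 100.0),
    "P-002": Policy("P-002", "B. Sample", 250.0),
}
bills = bill_policies(policies)
unmatched = post_payments(
    policies, [("P-001", 100.0), ("P-002", 50.0), ("P-999", 10.0)]
)
```

Each step performs only an addition or subtraction per record; the work lies in the number of records read and the number of bills written, not in the computation.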

Data analysis

In science and engineering, the terms data processing and information systems are considered too broad. Data processing is instead typically reserved for the initial stage of the overall data handling, which is followed by data analysis in the second stage.

Data analysis uses specialized algorithms and statistical calculations that are less often observed in a typical general business environment. For data analysis, software suites like SPSS or SAS, or their free counterparts such as DAP, gretl, or PSPP, are often used.
