Enterprise Data Acquisition Backend (EDAB) Technical Primer

Enterprise Data Acquisition Backend (EDAB) refers to enterprise-grade data acquisition infrastructure comprising Data Triangulation, IP Address Rotation, Business Intelligence and Geographic IP allocation services.

Corporations employ EDAB to remain competitive in the business intelligence and data warehousing arena without getting involved in complicated setup procedures. Simplicity is the key: operation is transparent from the end user's perspective.

 

EDAB Branches

I. DATA TRIANGULATION

Many organizations engage in price gouging or data distortion based on the locality of their visitors.

Data triangulation is a technique that facilitates validation of data through cross-verification from two or more sources.

EDAB facilitates data triangulation through geographic distribution of IP addresses, which allows the end user to examine and verify destination contents from various localities. An entity can appear to be in Boston, Texas, London, Johannesburg or Sydney at the same moment to validate the contents of acquired data and detect anomalies in presentation.

Example: ACME Company offers a coffee mug for $5 on its website.

A user accesses the website from the USA, and the page presented lists $5 as the price of the coffee mug.

Another user accesses the website from China, but this time the page presented lists $1 as the price of the coffee mug.

If we cross-validate this price from two or more localities through the EDAB data triangulation service, we can see that ACME Company is engaging in price gouging depending on the locality, and presumably the GDP per capita, of the visitor.

If ACME Company is your competitor, this gives valuable insight into their price markup ranges. If we factor in shipping, handling, local taxes, etc., we can estimate their cost for the product offered.
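The cross-validation step in the ACME example can be sketched in a few lines of Python. This is only an illustration, not EDAB's actual implementation: the function name `find_price_anomalies`, the 10% tolerance and the third locality (UK) are assumptions added for the example, and the prices are typed in by hand rather than fetched.

```python
from statistics import median


def find_price_anomalies(prices_by_locality, tolerance=0.10):
    """Return the localities whose price differs from the median price
    by more than `tolerance` (a fraction of the median)."""
    if len(prices_by_locality) < 2:
        raise ValueError("triangulation needs at least two localities")
    baseline = median(prices_by_locality.values())
    return {
        locality: price
        for locality, price in prices_by_locality.items()
        if abs(price - baseline) > tolerance * baseline
    }


# Prices for the same coffee mug as seen from each locality.
# USA and China come from the example above; UK is a made-up third source.
observed = {"USA": 5.00, "UK": 5.00, "China": 1.00}
print(find_price_anomalies(observed))
```

With only two sources the function can show that the sources disagree but not which one is the outlier, which is why a third locality is included here.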

 

II. IP ADDRESS ROTATION

In networking, an IP rotation service refers to changing the static IP (Internet Protocol) addresses of internet daemons, services or servers at random or predetermined intervals.

First, a primer on the building blocks of IP rotation: IP addresses.

 

What is an IP Address?

An Internet Protocol address (IP address) is a numerical label that is assigned to devices participating in a computer network that uses the Internet Protocol for communication between its nodes. An IP address serves two principal functions: host or network interface identification and location addressing. Its role has been characterized as follows: "A name indicates what we seek. An address indicates where it is. A route indicates how to get there." [1]

The designers of TCP/IP defined an IP address as a 32-bit number and this system, known as Internet Protocol Version 4 (IPv4), is still in use today. However, due to the enormous growth of the Internet and the predicted depletion of available addresses, a new addressing system (IPv6), using 128 bits for the address, was developed in 1995 [2], standardized by RFC 2460 in 1998 [3], and is in worldwide production deployment.
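The difference between the two address sizes can be seen with Python's standard `ipaddress` module. The two addresses below are taken from ranges reserved for documentation and are used purely as examples.

```python
import ipaddress

# Example addresses from the ranges reserved for documentation.
v4 = ipaddress.ip_address("192.0.2.1")
v6 = ipaddress.ip_address("2001:db8::1")

# An IPv4 address is a 32-bit number; dotted notation is only a display format.
print(v4.version, v4.max_prefixlen, int(v4))

# An IPv6 address is a 128-bit number.
print(v6.version, v6.max_prefixlen)

# Total number of distinct values in each address space.
print(2 ** 32)
print(2 ** 128)
```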

IP is used to route data packets between networks; IP addresses specify the locations of the source and destination nodes in the topology of the routing system.

In short, IP addresses form the backbone of the internet as we know it.

 

How is an IP address assigned?

IP addresses are assigned to a machine either at the time of booting, usually from local network servers or the ISP (dynamic IP), or permanently by fixed configuration of its hardware or software (static IP).

Each machine connected to the internet must have a unique IP address to communicate with other computers and to avoid address conflicts. For most users, this IP address is provided dynamically by a dial-up or DSL internet service provider from its IP pool, and it changes at every disconnect from the network or when the machine is powered off.

Servers (particularly DNS servers) and internet hosts, on the other hand, need static or fixed IP addresses to service requests and communicate with other internet hosts. IP addresses are a central part of security management and access control implementations.

 

What is IP Address Rotation?

IP Rotation is the process of distributing allocated IPs to a resource randomly or in a configurable manner specified by the administrator.

When a DSL user connects to an ISP, the user is assigned an IP address from a pool of available IPs in the ISP's network topology. The user's internet address becomes whichever IP was allocated. If a disconnection occurs, the ISP will allocate the next available IP from the pool, implementing IP address rotation transparently to the user.
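The pool behaviour described above can be modelled roughly as follows. This is a simplified sketch, not how any particular ISP works: the `AddressPool` class, the user name and the documentation-range addresses are all invented for the illustration, and real providers typically add lease timers on top of this.

```python
from collections import deque


class AddressPool:
    """Toy model of an ISP address pool (hypothetical, for illustration)."""

    def __init__(self, addresses):
        self._free = deque(addresses)   # addresses waiting to be handed out
        self._leases = {}               # user -> address currently assigned

    def connect(self, user):
        if user in self._leases:
            return self._leases[user]
        if not self._free:
            raise RuntimeError("address pool exhausted")
        address = self._free.popleft()  # next available address
        self._leases[user] = address
        return address

    def disconnect(self, user):
        # A released address goes to the back of the queue, so a
        # reconnecting user normally receives a different one.
        self._free.append(self._leases.pop(user))


pool = AddressPool(["198.51.100.10", "198.51.100.11", "198.51.100.12"])
first = pool.connect("alice")
pool.disconnect("alice")
second = pool.connect("alice")
print(first, second)
```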

Internet-facing daemons on many hosts already implement automatic IP address rotation for incoming traffic. For example, a DNS server might rotate the IP address returned for a web server in round-robin fashion to facilitate load balancing of incoming traffic or equal distribution of resources among role-based access control lists. This method is commonly employed by large datacenters and organizations.
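Round-robin rotation for incoming traffic can be illustrated with a small stand-in for a DNS server. The `RoundRobinResolver` class, the hostname and the addresses are assumptions made for this sketch. A real DNS server usually returns the full record set in rotated order rather than a single address, so treat this as a simplification.

```python
from itertools import cycle


class RoundRobinResolver:
    """Hands out the addresses registered for a name one after another."""

    def __init__(self, records):
        # records maps a hostname to the list of addresses serving it.
        self._rotations = {name: cycle(addrs) for name, addrs in records.items()}

    def resolve(self, name):
        return next(self._rotations[name])


resolver = RoundRobinResolver(
    {"www.example.com": ["192.0.2.10", "192.0.2.11", "192.0.2.12"]}
)
answers = [resolver.resolve("www.example.com") for _ in range(4)]
print(answers)
```

After the last address has been handed out, the rotation wraps around to the first, so successive clients are spread evenly across the servers.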

The real use of IP rotation can be observed in outgoing internet traffic. Source IP is the foundation of access controls applied by destination firewalls, so by rotating IP addresses a server, host or service can evade restrictions that are keyed to its source address.

 

IP Address Rotation Methods

EDAB utilizes 4 main strategies in IP Rotation implementations.

  • Pre-configured IPs: IP rotation takes place at fixed time intervals. Every minute, or other specified interval, a new IP is assigned.
  • Random IPs: Each connection initiated is assigned a randomly rotating IP.
  • Burst IPs: IP addresses are rotated after a specified number of hits. If the limit is 10 connections, the 11th will come from a different IP.
  • Specific IPs: Originating source can choose which IP address to use for the outgoing connection.
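The four strategies listed above could be sketched along these lines. This is an illustrative model only and does not reflect EDAB internals: the `IPRotator` class, its method names and its parameters are assumptions, and the sketch only selects an address; binding an outgoing socket to that address is left out.

```python
import random
import time


class IPRotator:
    """Selects a source address from a pool using one of four strategies."""

    def __init__(self, addresses, interval_seconds=60, burst_size=10,
                 clock=time.monotonic, rng=None):
        if not addresses:
            raise ValueError("at least one address is required")
        self.addresses = list(addresses)
        self.interval_seconds = interval_seconds
        self.burst_size = burst_size
        self._clock = clock
        self._rng = rng or random.Random()
        self._start = clock()
        self._hits = 0

    def timed(self):
        """Pre-configured IPs: move to the next address every interval."""
        slot = int((self._clock() - self._start) // self.interval_seconds)
        return self.addresses[slot % len(self.addresses)]

    def random_ip(self):
        """Random IPs: pick an address at random for each connection."""
        return self._rng.choice(self.addresses)

    def burst(self):
        """Burst IPs: move to the next address after burst_size hits."""
        index = (self._hits // self.burst_size) % len(self.addresses)
        self._hits += 1
        return self.addresses[index]

    def specific(self, address):
        """Specific IPs: the caller names the address to use."""
        if address not in self.addresses:
            raise ValueError(f"{address} is not in the pool")
        return address


rotator = IPRotator(["203.0.113.1", "203.0.113.2", "203.0.113.3"])
burst_hits = [rotator.burst() for _ in range(11)]
print(burst_hits[9], burst_hits[10])   # 10th and 11th connection
print(rotator.timed(), rotator.random_ip(), rotator.specific("203.0.113.3"))
```

The clock and random generator are passed in as parameters so that the time-based and random strategies can be exercised deterministically in tests.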

 

IP Address Rotation illegitimate uses

Perhaps the most widespread abuse of IP address rotation is by spammers. Almost all spam farms employ some method of IP rotation to fool destination mail servers into believing that email connections are coming from different net blocks. The aim is to deliver as many emails as possible so that the return rate on the targeted product is high.

Spammers also deploy rotating IPs in their link exchange farms for rogue SEO firms, known as black hat SEOs. The aim is to deceive major search engines into awarding better page rankings.

Various DDoS scenarios are possible when IPs are rotated. However, perpetrators of these attacks mostly prefer botnets.

 

IP Address Rotation legitimate uses

Google, Yahoo, Bing and Amazon all deploy IP rotation farms for their outgoing bots in order to distribute the load on their networks.

Anti-counterfeiting and anti-piracy agencies deploy IP rotation methods for data harvesting or for researching questionable content.

Business intelligence companies use IP rotation to harvest, retrieve, scrape or mine data for performance metrics and data analytics.

Quantitative and qualitative research companies deploy IP rotation to observe variations.

Data triangulation companies use IP rotation to verify the validity of their content.

Data warehouses use IP rotation to access a wider selection from their destinations.

Corporate firms use IP rotation to see past price gouging and geotargeted presentation.

SEO companies use IP rotation to check keyword rankings from different localities.

 

Despite the high potential for abuse, the legal uses of IP rotation far outweigh its negative sides. IP address rotation solutions form the backend of EDAB operations.

We at X5 Networks believe proper community vigilance is the key in having a better working environment for us and for our users.

 

III. BUSINESS INTELLIGENCE

Business intelligence refers to techniques used in spotting, digging out, and analyzing business data, such as lists of products and their costs. These techniques provide historical, current, and predictive views of business activity. Applied properly, they are crucial to a company's success because they provide a 360-degree view of competitors' operations. Online analytical processing, quantitative and qualitative analysis, data mining and data harvesting are all parts of business intelligence. Competitors, in turn, employ measures to minimize access to their otherwise public data.

EDAB deploys quantitative and qualitative analysis methods, performance metrics and data warehousing to assist organizations in the business intelligence field.

 

EDAB Implementation

Client-side integration with EDAB has various options:

- Direct proxy access (HTTP/SOCKS).

- Colocated server option in datacenter.

- VPN option for tunnelling where colocation is not possible.
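For the direct proxy access option, a client might be configured roughly as shown below using Python's standard `urllib.request` module. The proxy URL is a placeholder, not a real EDAB endpoint, and the helper name `build_proxy_opener` is made up for this sketch. Only HTTP proxying is shown, since the standard library has no built-in SOCKS support.

```python
import urllib.request

# Placeholder endpoint; substitute the proxy address issued by the provider.
PROXY_URL = "http://proxy.example.net:8080"


def build_proxy_opener(proxy_url):
    """Return an opener that sends HTTP and HTTPS requests via proxy_url."""
    handler = urllib.request.ProxyHandler(
        {"http": proxy_url, "https": proxy_url}
    )
    return urllib.request.build_opener(handler)


opener = build_proxy_opener(PROXY_URL)

# No request is sent here. With a reachable proxy, a call such as
#   opener.open("http://example.com/", timeout=10)
# would be routed through the configured proxy.
```

Keeping the proxy in a dedicated opener, rather than installing it globally, lets one program use several proxy endpoints side by side.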

 

 

EDAB History

1998: EDAB simplifies IP Address Rotation services by combining them into a single unified service.

1999: EDAB implements Business Intelligence and data warehousing services.

2001: EDAB implements Geographic IP and locality option.

2003: EDAB implements Data Triangulation service.

2006: EDAB reaches 12 localities around the globe.

2010: EDAB expands to 24 data centers.

 

 

References

 

 

This article is published under Creative Commons Attribution-ShareAlike License