.net - Approaches to fast visualization of a huge amount of data
I am looking for ways to summarize a large (and dynamically growing) amount of data for visualization in a graph view.
Say I have a binary file of timestamp-value pairs; this file grows in realtime and can easily reach many gigabytes.
There are several views that display this data as a graph/plot. Since in most cases we have more data points than pixels on the X-axis, the data has to be reduced. Each view may require a different resolution depending on its size on screen, and zooming in and out causes rapid changes of that resolution.
The current algorithm divides the data into buckets of equal length and calculates the minimum and maximum value of each. Then, for each pixel on the X-axis, we draw a vertical line from the minimum to the maximum value. This way we can make sure that no outliers are lost (which is a requirement).
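For illustration, here is a minimal sketch of that min/max bucketing, assuming the samples are already in memory (in reality they would be streamed from the binary file; the `Sample`, `Bucket`, and `Downsample` names are hypothetical):

```csharp
using System;

// Hypothetical sample type: one timestamp-value pair from the binary file.
public readonly record struct Sample(long Timestamp, double Value);

// One vertical line per pixel: its lowest and highest value.
public readonly record struct Bucket(double Min, double Max);

public static class MinMaxDownsampler
{
    public static Bucket[] Downsample(Sample[] samples, int pixels)
    {
        var buckets = new Bucket[pixels];
        int perBucket = (samples.Length + pixels - 1) / pixels; // ceiling division

        for (int p = 0; p < pixels; p++)
        {
            int start = p * perBucket;
            int end = Math.Min(start + perBucket, samples.Length);
            if (start >= end) break; // fewer samples than pixels

            double min = double.PositiveInfinity, max = double.NegativeInfinity;
            for (int i = start; i < end; i++)
            {
                min = Math.Min(min, samples[i].Value);
                max = Math.Max(max, samples[i].Value);
            }
            buckets[p] = new Bucket(min, max); // no outlier is lost
        }
        return buckets;
    }
}
```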
Every time a new resolution is required, we have to pick a different bucket length and run over the complete file, which is far too slow.
To build some kind of caching layer, we would have to precompute the data at several resolutions. Unfortunately, I do not know how to implement such a cache in a way that still preserves the outliers.
Do you have any hints, or do you know of literature covering approaches to this kind of problem?
The environment is Microsoft .NET, but that should not make any difference, since this is about general ideas.
Thanks in advance.
My take would be to store the data in several zoom-level files, like so (a sketch in C# follows the list):
- < Li> add {timestamp, value} to FILE [0].
- For I = 0 ... MAX_REASONABLE:
- If the sample count module of file [i] is zoomed, exit.
- Get the final zoom sample from the file [I].
- Collapse them into a single sample (like average timestamps? Min minimum?) And average sample data.
- If the file [i + 1] does not exist, then <.
- Write a new sample to FILE [i + 1]
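A minimal sketch of that cascade, assuming each level is held as an in-memory list (in the real system each level would be one of the append-only FILE[i] binary files; `ZoomPyramid`, `Zoom` and `MaxReasonable` are names of my own choosing):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public readonly record struct Sample(long Timestamp, double Min, double Max);

public class ZoomPyramid
{
    private const int Zoom = 2;           // collapse factor between levels
    private const int MaxReasonable = 40; // more levels than any realistic file needs
    private readonly List<List<Sample>> _levels = new() { new List<Sample>() };

    public void Add(long timestamp, double value)
    {
        _levels[0].Add(new Sample(timestamp, value, value));

        for (int i = 0; i < MaxReasonable; i++)
        {
            // Only collapse once a full group of ZOOM samples has accumulated.
            if (_levels[i].Count % Zoom != 0) break;

            var group = _levels[i].TakeLast(Zoom).ToList();
            var collapsed = new Sample(
                (long)group.Average(s => s.Timestamp), // representative timestamp
                group.Min(s => s.Min),                 // outliers survive: keep min...
                group.Max(s => s.Max));                // ...and max of the group

            if (_levels.Count == i + 1) _levels.Add(new List<Sample>());
            _levels[i + 1].Add(collapsed);
        }
    }

    public IReadOnlyList<Sample> Level(int i) => _levels[i];
}
```

Because each collapsed sample carries the min and the max of its group, every zoom level still shows the outliers, which addresses the caching concern from the question.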
Each FILE[i] holds 1/ZOOM of the samples of FILE[i-1], so the total storage is the original size multiplied by 1 / (1 - (1/ZOOM)): you need an extra 100% of storage if ZOOM = 2, only an extra 33% if ZOOM = 4, and so on.
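That factor is just the sum of a geometric series; with N samples in FILE[0]:

```latex
\underbrace{N + \frac{N}{\mathrm{ZOOM}} + \frac{N}{\mathrm{ZOOM}^2} + \cdots}_{\text{all FILE}[i]}
= \frac{N}{1 - 1/\mathrm{ZOOM}}
\quad\Rightarrow\quad
\text{overhead} = \frac{1}{\mathrm{ZOOM}-1} =
\begin{cases}
100\% & \text{if } \mathrm{ZOOM}=2,\\
33\% & \text{if } \mathrm{ZOOM}=4.
\end{cases}
```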
When you need to visualize, you can quickly decide which zoom level comes closest. Say you need to render 800 pixels out of a 600,000-sample range with ZOOM = 2: the logarithm of 600,000 / 800 divided by the logarithm of ZOOM gives 9.55, which means you want zoom level 9.
That file has been zoomed ZOOM^9 = 512 times, so you read 600,000 / 512 ≈ 1171 samples and rescale a 1171 × H image down to the 800-pixel width.
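Picking the level is then a one-liner; a sketch (the names are mine):

```csharp
using System;

public static class LevelPicker
{
    // Deepest zoom level whose file still covers the range with >= `pixels` samples.
    public static int PickLevel(long sampleCount, int pixels, int zoom = 2)
    {
        double exact = Math.Log((double)sampleCount / pixels) / Math.Log(zoom);
        return Math.Max(0, (int)Math.Floor(exact)); // 600,000 / 800 with ZOOM = 2 -> 9.55 -> level 9
    }
}
```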
The total write cost increases by no more than 300% on average (with ZOOM = 2, each collapsed sample costs one extra write plus two re-reads, and the higher levels shrink geometrically); the total storage requirement increases by no more than 100%.
I have worked on this kind of system for map rendering, and it made real-time panning and zooming over a rectangular viewport of a terapixel map possible (limited only by network throughput and latency; there, we additionally played with the JPEG quality).