Statistics Performance
Vaex can calculate statistics such as mean, sum, count, and standard deviation on an N-dimensional grid at up to a billion (10^9) rows per second. Let's compare the performance of pandas and Vaex when computing the mean of a column:
Pandas DataFrame:
Python3
%time df_pandas["column3"].mean()
Output:
Wall time: 741 ms
49.49811570183629
Vaex DataFrame:
Python3
%time df_vaex.mean(df_vaex.column3)
Output:
Wall time: 347 ms
array(49.4981157)
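To make the phrase "statistics on an N-dimensional grid" more concrete, here is a minimal pure-Python sketch of a mean computed on a one-dimensional grid: rows are assigned to equal-width bins by one column, and the mean of another column is taken per bin. The function name `binned_mean` and the sample data are hypothetical and only illustrate the idea; this is not how Vaex implements it internally. In Vaex itself, the same kind of result is typically requested through the `binby` argument of its statistics methods.

```python
def binned_mean(xs, values, lo, hi, n_bins):
    """Mean of `values` in each of `n_bins` equal-width bins of `xs` over [lo, hi)."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for x, v in zip(xs, values):
        if lo <= x < hi:
            # Map the x value to its bin index; clamp to guard against
            # floating-point rounding right at the upper edge.
            i = min(int((x - lo) / width), n_bins - 1)
            sums[i] += v
            counts[i] += 1
    # Empty bins have no defined mean, so report NaN for them.
    return [s / c if c else float("nan") for s, c in zip(sums, counts)]


# Hypothetical sample data: bin by `xs`, average `values` within each bin.
xs = [0.5, 1.5, 1.7, 3.2]
values = [10.0, 20.0, 40.0, 5.0]
grid = binned_mean(xs, values, lo=0.0, hi=4.0, n_bins=4)
print(grid)
```

Each row is visited exactly once and only two small arrays (sums and counts) are kept, which is why this style of computation can scale to very large row counts.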
Introduction to Vaex in Python
Working with big data has become very common, so we need libraries that let us process large datasets on our own machines (desktops, laptops) with fast execution and low memory usage.
Vaex is a Python library that helps achieve this and makes working with large datasets much easier. It provides lazy, out-of-core DataFrames (similar to pandas) and can visualize, explore, and perform computations on big tabular datasets quickly and with minimal memory usage.
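The "out-of-core" idea can be illustrated with a small sketch that computes a column mean without ever holding the whole dataset in memory. This is a simplified stand-in using only the standard library and a streamed CSV file; Vaex itself relies on memory-mapped files and lazy expressions rather than this loop. The helper name `streaming_mean`, the file name, and the generated data are all assumptions made for illustration.

```python
import csv
import os
import tempfile


def streaming_mean(path, column, chunk_rows=1000):
    """Mean of one CSV column, reading at most `chunk_rows` values at a time."""
    total = 0.0
    count = 0
    with open(path, newline="") as f:
        chunk = []
        for row in csv.DictReader(f):
            chunk.append(float(row[column]))
            if len(chunk) == chunk_rows:
                # Fold the finished chunk into the running totals, then discard it.
                total += sum(chunk)
                count += len(chunk)
                chunk = []
        # Fold in whatever is left in the final, partial chunk.
        total += sum(chunk)
        count += len(chunk)
    return total / count if count else float("nan")


# Hypothetical demo data: a small CSV whose "column3" holds 0, 1, ..., 99.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.csv")
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["column3"])
        writer.writerows([i] for i in range(100))
    result = streaming_mean(path, "column3", chunk_rows=10)

print(result)
```

Memory use here depends on `chunk_rows`, not on the size of the file, which is the property that lets out-of-core tools handle datasets larger than RAM.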