With the exponential growth of data volume, big data has placed an unprecedented burden on current computing infrastructure. Dimensionality reduction of big data has attracted a great deal of attention in recent years as an efficient way to extract the core data, which is smaller to store and faster to process. This paper addresses three fundamental problems closely related to distributed dimensionality reduction of big data: big data fusion, the dimensionality reduction algorithm, and the construction of a distributed computing platform. A chunk tensor method is presented to fuse unstructured, semi-structured, and structured data into a unified model in which all characteristics of the heterogeneous data are arranged along the tensor orders. A Lanczos-based high-order singular value decomposition algorithm is proposed to reduce the dimensionality of the unified model. Theoretical analyses of the algorithm are provided in terms of storage scheme, convergence property, and computation cost. To execute the dimensionality reduction task, this paper employs the transparent computing paradigm to construct a distributed computing platform and uses a four-objective optimization model to schedule the tasks. Experimental results demonstrate that the proposed holistic approach is efficient for distributed dimensionality reduction of big data.
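To make the idea of reducing a tensor model by high-order SVD more concrete, the following is a minimal sketch of a truncated HOSVD in Python with NumPy. It is not the paper's algorithm: the abstract describes a Lanczos-based variant running on a distributed platform, whereas this sketch uses a dense `numpy.linalg.svd` on each mode unfolding, on a single machine. The function names (`unfold`, `mode_product`, `truncated_hosvd`, `reconstruct`), the tensor shape, and the chosen ranks are all assumptions made for illustration.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-n unfolding: bring axis `mode` to the front, flatten the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def mode_product(tensor, matrix, mode):
    """Multiply `tensor` by `matrix` along axis `mode` (n-mode product)."""
    moved = np.moveaxis(tensor, mode, 0)
    result = np.tensordot(matrix, moved, axes=(1, 0))
    return np.moveaxis(result, 0, mode)

def truncated_hosvd(tensor, ranks):
    """Return a core tensor and one truncated factor matrix per mode.

    Assumption: a dense SVD stands in for the Lanczos-based step
    described in the abstract.
    """
    factors = []
    for mode, rank in enumerate(ranks):
        # Left singular vectors of the mode unfolding span that mode's subspace.
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u[:, :rank])
    # Project the tensor onto the retained subspaces to obtain the core.
    core = tensor
    for mode, u in enumerate(factors):
        core = mode_product(core, u.T, mode)
    return core, factors

def reconstruct(core, factors):
    """Expand the core back to the original shape (a low-rank approximation)."""
    out = core
    for mode, u in enumerate(factors):
        out = mode_product(out, u, mode)
    return out

# Hypothetical example data: a random 6 x 5 x 4 tensor reduced to a 3 x 3 x 2 core.
rng = np.random.default_rng(0)
data = rng.standard_normal((6, 5, 4))
core, factors = truncated_hosvd(data, ranks=(3, 3, 2))
approx = reconstruct(core, factors)
```

In this sketch the core tensor plus the factor matrices play the role of the smaller "core data": they need less storage than the full tensor whenever the ranks are well below the original mode sizes, at the cost of an approximation error that depends on how much each mode is truncated.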