Sunday, January 3, 2021

Introducing Data Science

 

Data science.

In today's world, everything is stored as digital data.

When you withdraw cash from an ATM, book a cab, order food, or post on social media, you generate data. Wherever you go, data is being generated on a huge scale.

This data, when stored in digital format, can be used as a basis for analysis and decision making.

We need methods and tools to analyze and manage such large volumes of data.

 

Analysis is carried out on two major types of data:

  1. Traditional data

  2. Big data

     

Traditional data

Traditional data is stored in tables, in the form of rows and columns.

The data is kept in a database and can be manipulated using SQL queries.

S.No   Name       Phone number   Age   Residence
1      Srinidhi   8529637410     21    chennai
2      Anjana     7894561230     24    chennai
3      Preethi    9216549870     22    villupuram
4      Ram        7597532580     40    namakkal
...    ...        ...            ...   ...
500    Mohan      9173052864     59    chennai
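
To make the idea of manipulating such a table with SQL queries concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name `people` and the column names are assumptions chosen to mirror the table above, and only the four fully listed rows are loaded.

```python
import sqlite3

# In-memory database, so the sketch leaves no files behind.
conn = sqlite3.connect(":memory:")

# Hypothetical table and column names, chosen to mirror the table above.
conn.execute(
    """
    CREATE TABLE people (
        s_no      INTEGER PRIMARY KEY,
        name      TEXT,
        phone     TEXT,
        age       INTEGER,
        residence TEXT
    )
    """
)

rows = [
    (1, "Srinidhi", "8529637410", 21, "chennai"),
    (2, "Anjana", "7894561230", 24, "chennai"),
    (3, "Preethi", "9216549870", 22, "villupuram"),
    (4, "Ram", "7597532580", 40, "namakkal"),
]
conn.executemany("INSERT INTO people VALUES (?, ?, ?, ?, ?)", rows)

# Query: names of people living in chennai, youngest first.
chennai_names = [
    name
    for (name,) in conn.execute(
        "SELECT name FROM people WHERE residence = ? ORDER BY age", ("chennai",)
    )
]
print(chennai_names)

# Aggregate: average age per residence.
avg_age_by_city = dict(
    conn.execute("SELECT residence, AVG(age) FROM people GROUP BY residence")
)
print(avg_age_by_city)

conn.close()
```

Because every row has the same columns, filtering and grouping can be expressed in a single query each, which is what makes this kind of data easy to analyze.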

 

Big data

Big data, on the other hand, deals with extremely large amounts of data.

Here, the data is either

  • Structured
  • Unstructured, or
  • Semi-structured
 

Structured data is stored in the form of tables in a database (a relational database system). Data generated from new registrations on a website, bank account details, customer details in a shop, and sales details in large companies are some examples.

The main difference between traditional data and structured big data is the rate at which it is generated: traditional data might be generated, say, every hour, whereas big data is generated every second.

The volume of traditional data ranges from gigabytes to terabytes.

The volume of big data ranges from exabytes to zettabytes.
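
For a sense of scale, the units mentioned above can be laid out in order. The sketch below assumes decimal (SI) units, where each unit is 1000 times the previous one; the names `UNITS` and `BYTES_PER_UNIT` are only illustrative.

```python
# Decimal (SI) byte units, each 1000 times the previous one.
UNITS = [
    "kilobyte", "megabyte", "gigabyte", "terabyte",
    "petabyte", "exabyte", "zettabyte",
]
BYTES_PER_UNIT = {name: 1000 ** (i + 1) for i, name in enumerate(UNITS)}

for i, name in enumerate(UNITS):
    print(f"1 {name:<9} = 10^{3 * (i + 1)} bytes")

# How many terabytes fit in one zettabyte?
ratio = BYTES_PER_UNIT["zettabyte"] // BYTES_PER_UNIT["terabyte"]
print(f"1 zettabyte = {ratio:,} terabytes")
```

A zettabyte is a billion terabytes, so the gap between the two ranges is far larger than the unit names alone suggest.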

 

S.No     Name       Phone number   Age   Residence
1        Srinidhi   8529637410     21    chennai
2        Anjana     7894561230     24    chennai
3        Preethi    9216549870     22    villupuram
4        Ram        7597532580     40    namakkal
...      ...        ...            ...   ...
1000     Mohan      9173052864     59    chennai
...      ...        ...            ...   ...
100000   Mani       9073597880     50    nagercoil


Unstructured data includes data such as audio, video, and images.

We cannot store unstructured data in the form of rows and columns (tables).

Examples include data from television media, audio files, and photos from Facebook or Instagram.

Such data does not have a defined data model and is not organized in any specific manner.

 

Semi-structured data contains tags and elements (metadata) that are used to group the data and describe how it is stored.

We cannot store this type of data directly in tables.

Satellite data, emails, zipped files, and websites are some examples of semi-structured data.
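
As a small illustration of how tags describe semi-structured data, here is a sketch that parses a JSON record with Python's built-in json module. The email-like record and its field names are made up for this example and do not follow any real email format.

```python
import json

# Hypothetical email-like record; the field names are illustrative only.
raw = """
{
  "from": "srinidhi@example.com",
  "to": ["anjana@example.com", "ram@example.com"],
  "subject": "Meeting notes",
  "body": "Free-form text that does not fit a fixed column.",
  "attachments": [{"name": "notes.zip", "size_kb": 42}]
}
"""

message = json.loads(raw)

# The keys (tags) describe the data, so fields can be looked up by name...
print(message["subject"])

# ...but values may be nested or vary in length from one record to the next.
recipient_count = len(message["to"])
attachment_names = [a["name"] for a in message["attachments"]]
print(recipient_count, attachment_names)
```

The record carries its own description through its keys, yet the list of recipients and the nested attachments have no fixed width, which is why it does not map cleanly onto rows and columns.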
