Towards Energy Efficient Big Data Gathering In Densely Distributed Sensor Networks
INTRODUCTION:- Recent developments in various areas of Information and Communication Technology (ICT) have contributed to an explosive growth in the volume of data. According to a report published by IBM, 90 percent of the data in the world was generated in the last couple of years. In recent years the concept of big data has emerged and is attracting much attention from government, industry, and academia. As shown in Fig. 1, big data comprises high-volume, high-velocity, and high-variety information assets, which are difficult to collect, store, and process with the available technologies.
Variety indicates that the data has highly varied structures (e.g., data generated by a wide range of sources such as Machine-to-Machine (M2M) communication, Radio Frequency Identification (RFID), and sensors), while velocity refers to the need for high-speed processing and analysis (e.g., fast database transactions, click-streaming, and so forth). Although currently used services (e.g., social networks, network switches, cloud storage, and so forth) already generate a large volume of big data, it is anticipated that more and more data will be generated by sensors and RFID devices such as motion sensors, accelerometers, atmospheric sensors, thermometric sensors, and so on. In fact, according to a report by Oracle, the volume of data generated by sensors and RFID devices is expected to reach the order of petabytes. As shown in Fig. 1, sensors and RFID devices are responsible for generating big data in both large volume and wide variety.
AIM:- To propose an effective solution that reduces energy consumption in sensor networks and uses the sink node's mobility to facilitate data gathering. We present a new mobile sink routing and data gathering method based on network clustering with a modified Expectation Maximization (EM) technique.
Synopsis:- A mobile wireless sensor network can simply be defined as a wireless sensor network (WSN) in which the sensor nodes are mobile. Mobile sensor networks are a smaller, emerging field of research in contrast to their well-established predecessor, the static sensor network.
Mobile sensor networks are much more versatile than static sensor networks, as they can be deployed in any scenario and cope with rapid topology changes. Commonly, the nodes consist of a radio transceiver and a microcontroller powered by a battery, together with some kind of sensor for detecting heat, light, humidity, temperature, etc.
In this section, we first outline the clustering problem in a WSN with a mobile sink and the challenges in solving this problem. After that, we introduce the network model under consideration and give an overview of the EM algorithm for clustering. Based on the EM algorithm, we propose our clustering method and the procedure for gathering data with it.
Twitter is a well-known social and micro-blogging website that allows millions of users to interact across different communities, topics, and tweeting trends. The big data generated on Twitter daily, and its impact on social networking, have motivated the application of data mining (analysis) to extract useful information from tweets. In this work, we assess the impact of tweets using spectral clustering.
Existing System:- Sensor network systems have recently been used in many situations and provide a variety of information. Although they play an important role in our lives, their performance is not sufficient for real-time data collection. We discuss the requirements for next-generation data gathering.
Although sensor networks provide essential services, they have shortcomings in coverage and mobility. Many of the existing systems rely on wired or wireless ground infrastructure to collect data from sensor terminals. The coverage of these networks is limited, and creating new infrastructure for remote areas is difficult for both economic and physical reasons.
Drawbacks Of Existing System:- The network is divided into several subnetworks because of the limited wireless communication range. For example, sensors deployed in one building may not be able to communicate with sensors distributed in neighboring buildings. The limited communication range therefore poses a challenge for collecting data from all sensor nodes.
Wireless transmission consumes the energy of the sensors. Even though the amount of data generated by an individual sensor is not significant, each sensor consumes a lot of energy relaying the data generated by surrounding sensors.
Proposed System:- The main motivation is to study the effect of data request messages as the number of clusters increases. Based on a common data gathering model for densely distributed WSNs, we demonstrate that the number of data request messages has a noticeable impact on the energy consumption of the sensor nodes. This impact grows as the connectivity of the nodes increases.
The mobile sink is responsible for collecting the data from the nodes in each cluster. Delay is the main problem of using a mobile sink in WSNs. To shorten this delay, we implemented the Expectation Maximization (EM) algorithm.
Modules:-
Cluster Creation:- WSNs are autonomous systems consisting of mobile hosts connected by multi-hop wireless links. In each cluster, a cluster head (CH) is elected according to its weight, which is computed by combining a set of system parameters (e.g., mobility). Sensor nodes are equipped with storage to hold sensed information until the mobile sink approaches the cluster centroid.
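The weight-based election described above can be sketched as follows. This is only a minimal illustration: the source names mobility as a parameter but gives no formula, so the extra parameters (residual energy, node degree), the coefficients, and the rule that the smallest weight wins are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Node:
    node_id: int
    mobility: float         # normalised average speed in [0, 1]; lower is more stable (assumed)
    residual_energy: float  # remaining battery fraction in [0, 1] (assumed parameter)
    degree: int             # number of one-hop neighbours (assumed parameter)


# Hypothetical coefficients; the source does not specify any values.
W_MOBILITY, W_ENERGY, W_DEGREE = 0.5, 0.3, 0.2


def weight(node, max_degree):
    """Combined weight; a smaller value marks a better cluster-head candidate."""
    degree_penalty = 1.0 - node.degree / max_degree if max_degree else 1.0
    return (W_MOBILITY * node.mobility
            + W_ENERGY * (1.0 - node.residual_energy)
            + W_DEGREE * degree_penalty)


def elect_cluster_head(cluster):
    """Return the node with the smallest weight; ties are broken by node id."""
    max_degree = max(n.degree for n in cluster)
    return min(cluster, key=lambda n: (weight(n, max_degree), n.node_id))


cluster = [
    Node(1, mobility=0.9, residual_energy=0.8, degree=3),
    Node(2, mobility=0.1, residual_energy=0.7, degree=4),
    Node(3, mobility=0.2, residual_energy=0.2, degree=2),
]
head = elect_cluster_head(cluster)
print("Elected cluster head:", head.node_id)
```

In this toy cluster the slow, well-connected node 2 is elected; changing the coefficients shifts the balance between stability and remaining energy.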
Twitter Data Generation:- Twitter, a highly popular platform for information exchange, can be used as a data-mining source that could help with the aforementioned challenges; in this project its data plays the role of the data collected by the sensor nodes. Specifically, using a large data set of harvested tweets, the sensor nodes connect with the sink to transfer the data set to the HDFS system.
The REST APIs provide programmatic access to read and write Twitter data. The REST API also identifies Twitter applications, and responses are available in JSON.
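Since responses arrive as JSON, the fields used later for analytics (hashtag, location, retweet count) can be pulled out with the standard `json` module. The sketch below works on a hand-written sample payload, not on real API output; the field layout (`entities.hashtags`, `user.location`, `retweet_count`) is assumed to follow the tweet object of the Twitter REST API, and the sample values are invented for illustration.

```python
import json

# Hypothetical sample payload shaped like a list of tweet objects.
SAMPLE_RESPONSE = """
[
  {"id": 1, "text": "Big data from sensors #WSN #BigData",
   "retweet_count": 12,
   "user": {"location": "Example City"},
   "entities": {"hashtags": [{"text": "WSN"}, {"text": "BigData"}]}},
  {"id": 2, "text": "No tags here",
   "retweet_count": 0,
   "user": {"location": ""},
   "entities": {"hashtags": []}}
]
"""


def extract_features(raw_json):
    """Reduce each tweet to the three attributes used for analytics."""
    records = []
    for tweet in json.loads(raw_json):
        hashtags = [h["text"].lower()
                    for h in tweet.get("entities", {}).get("hashtags", [])]
        records.append({
            "id": tweet["id"],
            "hashtags": hashtags,
            # An empty location string is treated as missing.
            "location": tweet.get("user", {}).get("location") or None,
            "retweet_count": int(tweet.get("retweet_count", 0)),
        })
    return records


features = extract_features(SAMPLE_RESPONSE)
for record in features:
    print(record)
```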
EM computation:- When the sink node arrives at a cluster centroid, it sends a data request message to the cluster head to invoke data transmission from the sensor nodes. The nodes that receive the data request message send their data to the sink node and broadcast the data request message to their neighboring nodes using multi-hop traversal. Clustering can be based on probability models, which also makes it possible to handle missing values. This has led to the development of clustering methods such as Expectation Maximization (EM), which is based on the principle of maximum likelihood estimation with unobserved variables in finite mixture models.
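To make the EM idea concrete, here is a minimal sketch of the standard EM algorithm for a mixture of isotropic two-dimensional Gaussians, applied to node positions. It is not the modified EM technique of the proposed system, whose details are not given here; the function name, the deterministic initialisation, and the fixed iteration count are assumptions made for the example.

```python
import math


def em_cluster(points, k, iterations=50):
    """Standard EM for k isotropic 2-D Gaussians; returns (centroids, labels)."""
    n = len(points)
    # Deterministic initialisation (assumed): spread centroids over the input order.
    centroids = [points[(i * n) // k] for i in range(k)]
    variances = [1.0] * k
    weights = [1.0 / k] * k
    resp = []
    for _ in range(iterations):
        # E-step: responsibility of each cluster for each node position.
        resp = []
        for x, y in points:
            dens = []
            for j in range(k):
                cx, cy = centroids[j]
                d2 = (x - cx) ** 2 + (y - cy) ** 2
                dens.append(weights[j] * math.exp(-d2 / (2 * variances[j]))
                            / (2 * math.pi * variances[j]))
            total = sum(dens) or 1e-300  # guard against division by zero
            resp.append([d / total for d in dens])
        # M-step: re-estimate centroid, variance and mixing weight per cluster.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # empty cluster: keep its previous parameters
            cx = sum(r[j] * p[0] for r, p in zip(resp, points)) / nj
            cy = sum(r[j] * p[1] for r, p in zip(resp, points)) / nj
            var = sum(r[j] * ((p[0] - cx) ** 2 + (p[1] - cy) ** 2)
                      for r, p in zip(resp, points)) / (2 * nj)
            centroids[j] = (cx, cy)
            variances[j] = max(var, 1e-6)
            weights[j] = nj / n
    # Hard assignment: each node joins its most responsible cluster.
    labels = [max(range(k), key=lambda j: r[j]) for r in resp]
    return centroids, labels


nodes = [(0, 0), (1, 0), (0, 1), (1, 1),
         (10, 10), (11, 10), (10, 11), (11, 11)]
centroids, labels = em_cluster(nodes, k=2)
print(centroids, labels)
```

The resulting centroids are the points the mobile sink would visit, and the labels give each node's cluster membership.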
Data collection:- The mobile sink patrols every cluster centroid and collects the data from the nodes in each cluster. The sensor data is then transferred to the HDFS system with less energy consumption. Finally, spectral clustering is performed for data analytics based on hashtag, location, and retweet count.
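The patrol of the cluster centroids can be illustrated with a simple route planner. The source does not say how the sink's visiting order is chosen, so the greedy nearest-neighbour heuristic below is an assumption used purely for illustration; the function name and coordinates are hypothetical.

```python
import math


def nearest_neighbour_tour(start, centroids):
    """Visit every centroid once, always moving to the closest unvisited one."""
    remaining = list(centroids)
    tour, position, length = [], start, 0.0
    while remaining:
        nxt = min(remaining, key=lambda c: math.dist(position, c))
        length += math.dist(position, nxt)
        tour.append(nxt)
        remaining.remove(nxt)
        position = nxt
    return tour, length


# Hypothetical centroid coordinates; the sink starts at the origin.
tour, length = nearest_neighbour_tour(
    (0.0, 0.0), [(10.0, 0.0), (3.0, 4.0), (3.0, 0.0)])
print("Visiting order:", tour)
print("Travel distance:", round(length, 3))
```

A shorter tour means the sink reaches each cluster sooner, which is the delay the clustering step is meant to reduce; a greedy tour is not guaranteed to be the shortest one.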
Motivation:- The following challenges and benefits give us the motivation to develop this product.
Challenges:-
We first outline the clustering problem in a WSN with a mobile sink and the challenges in solving this problem. After that, we introduce the network model under consideration and give an overview of the EM algorithm for clustering. Based on the EM algorithm, we propose our clustering method and the procedure for gathering data with it.
Architecture Diagram:-
Our implementation suggests that energy-efficient big data gathering in such networks is indeed necessary. Whereas conventional mobile sink schemes can reduce the energy consumption of the sensor nodes, they lead to a number of challenges, such as determining the sink node's trajectory and forming clusters prior to data collection. To address these challenges, we propose a mobile-sink-based data collection method that introduces a new clustering method based on a modified Expectation Maximization technique.
IMPLEMENTATION:-
Implementation literally means to put into effect or to carry out. The system implementation phase of the software deals with the translation of the design specifications into source code. The ultimate goal of the implementation is to write the source code and the internal documentation so that they can be verified easily. The code and documentation should be written in a manner that eases debugging, testing, and modification. System flowcharts, sample output, sample runs on packages, etc. are part of the implementation.
Various types of bugs were discovered while debugging the modules. These ranged from logical errors to failures on account of various processing cases.
System Implementation:-
A post-implementation review is an evaluation of the extent to which the system accomplishes its stated objectives and of whether actual project costs exceed initial estimates.
After the system is implemented and conversion is complete, a review should be conducted to determine whether the system is meeting expectations and where improvements are needed. A post-implementation review measures the system's performance against predetermined requirements. It determines how well the system continues to meet performance specifications. It also provides information to determine whether major redesign or modification is required.
There are four considerations when the project is developed. They are as follows:-
- Adaptation
- Prevention/Integrity
- Enhancement
- Correction
Adaptation/Enhancement:-
In this project a high-performance data synchronization server for mobile devices is proposed. In a mobile application system, the data sets (e.g., contacts, music, video, images) are usually stored both on the mobile device and in the system database. After several operations on the mobile system, the data sets on the mobile device and in the system database may no longer be identical.
Prevention/Integrity:-
Security has been a major aspect of the prevailing system and is to be considered key to the success of any project. The software developed here, Mobile-Sync, has been given full security, providing each tester with their own access. We know that:
Integrity = [1 - (threat * (1 - security))]
where threat is the probability that an attack of a given type occurs within a given time and security is the probability that such an attack is repelled.
Every measure is employed to secure the system from any type of threat, and care has been taken to maintain its integrity.
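As a worked illustration of the integrity measure, the short sketch below assumes the commonly cited form integrity = 1 - threat * (1 - security), with both quantities expressed as probabilities. The input values are hypothetical and are not measurements from this project.

```python
def integrity(threat, security):
    """Integrity for one attack type, with both inputs as probabilities in [0, 1]."""
    if not (0.0 <= threat <= 1.0 and 0.0 <= security <= 1.0):
        raise ValueError("threat and security must be probabilities in [0, 1]")
    return 1.0 - threat * (1.0 - security)


# Hypothetical values: an attack is attempted with probability 0.25
# and repelled with probability 0.95.
example = integrity(threat=0.25, security=0.95)
print(round(example, 4))
```

With these inputs the integrity is approximately 0.9875; raising the security probability toward 1 drives the integrity toward 1 regardless of the threat level.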
Correction:-
The project is corrective to its end, and all validation has been incorporated into the software developed so that no further corrective action is foreseen.
NOTE:-
The software has been developed keeping in mind the requirements of the Share Investors to share application. One of the most important factors in developing any application is experience. Due to lack of experience, we might have overlooked some things that should have been taken into consideration.
Maintenance:-
Maintenance activities involve making
enhancements to software products, adapting products to new environments, and
correcting problems. Adaptation of software to a new environment may involve
moving the software to a different machine. Problem correction involves
modification and revalidation of software to correct errors.
Maintenance activities consume a large portion of the total life-cycle budget. Software maintenance accounts for 70 percent of total software life-cycle costs. Of the maintenance budget, 60 percent goes to enhancement and 20 percent each to adaptation and correction. The primary product attributes that contribute to software maintainability are clarity, modularity, and good internal documentation of the source code, as well as appropriate supporting documents.
Documentation:-
Documentation is a method of communication. A
satisfactory documentation of the system should be objective, factual and
complete. Thus format, length, volume or complexity does not determine its
adequacy. In documentation, there are no uniform standards that are applicable
to all system projects.
Each module was documented by embedding comments in the executable portion of the code. To enhance the readability of the comments, indentation, parentheses, blank lines, spaces, and proper alignment of loops were used around the blocks of comments. Care was also taken to use descriptive names for tables, fields, modules, forms, etc. Proper use of indentation, parentheses, blank lines, and spaces was also ensured during coding to enhance the readability of the code.
REFERENCES:-
[1] IBM, “Four vendor views on big Data and big data analytics: IBM,” http://www.01.ibm.com/software/in/data/bigdata/, Jan. 2012.
[2] A. Divyakant, B. Philip, et al., “Challenges and opportunities with Big Data,” 2012, a community white paper developed by leading researchers across the United States. [Online].
[3] S. Sagiroglu and D. Sinanc, “Big data: A review,” in International Conference on Collaboration Technologies and Systems (CTS), 2013.
[4] D. Baum and CIO Information Matters, “Big Data, big opportunity,” http://www.oracle.com/us/ccentral/ciosolutions/informationmatters/big-databig-opportunity/index.html.
[5] L. Ramaswamy, V. Lawson, and S. Gogineni, “Towards a quality-centric big Data architecture for federated sensor services,” in IEEE International Congress on Big Data (BigData Congress), 2013.
[6] C.-C. Lin, M.-J. Chiu, C.-C. Hsiao, R.G. Lee, and Y.-S. Tsai, “Wireless health care service system for elderly with dementia,” IEEE Transactions on Information Technology in Biomedicine, vol. 10, no. 4, pp. 696-704, 2006.
[7] P. Ross, “Managing care through the air [remote health monitoring],” IEEE Spectrum, vol. 41, no. 12, pp. 26-31, 2004.