Literature Review

Storage, processing, and retrieval of big data in the cloud are significant problems in current research. Pedro et al. [287] gave an overview of present and future issues. The document discusses the scalability and fault tolerance of various vendors, including Google, IBM, Nokia, and RedBus. The authors further considered security, privacy, integrity, disaster recovery, and fault tolerance issues. The authors of [2] reviewed current service models, important concepts of cloud computing, and the processing of big data. Elmustafa and Rashid [26] presented a survey of big data security issues in cloud computing.

Linda et al. [180] presented environmental examples of big data use in government, including the Environmental Protection Agency, Department of the Interior, Department of Energy, and Postal Service. The study covers the government open access initiatives, the federal data center consolidation initiative, and the online enforcement of compliance. James [195] presented a roadmap to the success of big data analytics and applications. The report discusses the definition and description of unstructured data, relevant use cases in the cloud, potential benefits, and challenges associated with deploying in the cloud.

The impact of cloud computing on healthcare is studied in [290]. The study covers on-demand access to computing and large storage, support for big data sets for electronic health records, and the ability to analyze and track health records. Alan presented the environmental sustainability of big data, along with its barriers and opportunities. That work also covers new opportunities for partnership-based collaboration, the sustainability of organizations' big data efforts, and emerging business models.

Yan et al. [391] discussed the access control in cloud computing. The paper contains the temporal access control in cloud computing using encryption techniques. Yuhong et al. [230] discussed data confidentiality in cloud computing. The article uses the trust-based evaluation encryption model. In this model, the trust factor decides the access control of user status. Young et al. [265] discussed the security issues in cloud computing. The paper describes the access control requirements, authentication and ID management in the cloud.

Ali and Erwin [140] reviewed security and privacy issues on big data and cloud aspects. They concluded that cloud data privacy and safety is based on the cloud provider. They also discussed big data security challenges and cloud security challenges. Their paper examines the security policy management and big data infrastructure and programming models. They did not suggest any particular model but discussed all possible solutions for the security of big data in the cloud.

Marcos et al. [51] presented approaches and environments to carry out big data computing in the cloud. The paper discusses the visualization and user interaction, model building, and data management. Venkata et al. [177] examined issues in a cloud environment for big data. The primary focus is security problems and possible solutions. Further, they discussed MapReduce and Apache environments in the cloud and the need for security.

Saranya and Kumar [322] addressed the security issues associated with big data in a cloud environment. They suggested a few approaches for the complicated business environment. The paper discusses unstructured big data characteristics, analytics, Hadoop architecture, and real-time big data analytics. The authors did not present any particular model in the article. They explained a few concepts related to security in a cloud environment.

Avodele et al. [55] presented issues and challenges for deployments of big data in the cloud. They suggested solutions that are relevant to organizations to deploy the data in the cloud. The authors indicated the importance of authentication controls and access controls.

Security in IoT was discussed in [249], [331]. The authors conclude that IoT networks urgently need to ensure confidentiality, authentication, access control, and integrity, among others. The reason for immediate attention to security is the dramatic increase in the number of connected devices. These devices create technical problems such as attacks with a broader scope of influence and attacks that last longer. Therefore, immediate attention and procedures are required to detect the hacker and avoid damage to sensitive information.

Zhang and Wu [399] discussed whether generated data is trustworthy or not. Further, the paper focused on access control models based on trust computing and on useful guidelines. Dina et al. [174] investigated the requirements in IoT with respect to a candidate's vision. The authors argued that access control should be implemented during the requirements stage. Ali et al. [35] discussed device authentication and access control in IoT. They worked to demonstrate the security requirements needed to eliminate possible cyber-attacks. Situational access control in IoT was discussed by Roei et al. [85]. They identified that situational tracking requires cross-framework interaction and permission. In this process the system senses the situation, infers the situation environment, and then activates the process.

The remainder of the paper discusses the problem formulation that leads to the authentication model in the cloud (section 6.4), simulations (section 6.5), access control algorithms (section 6.6), and conclusions and future work (section 6.7).

Problem formulation

The security model involves the security of cloud customer data during storage, retrieval, transfer, processing, and updates (insert, modify, delete). Security needs to be set at the log entry at both the user and cloud level. The model also requires automatic validation of the status of stored data and verification of the trust level. The framework of the proposed model includes data encryption, correctness, and processing. These three modes depend upon the access rights of the user as discussed at the beginning of the current section. For storage and retrieval of data, the basic encryption techniques, the Advanced Encryption Standard (AES), Rivest-Shamir-Adleman (RSA), and a steganography model, are sufficient. If the data requires storage and processing, the recommendations in [106] may be useful. That paper discussed various techniques for searching cipher text and for query isolation (avoiding the untrusted server). Controlled searching, dealing with variable word lengths, searching an encrypted index, and support for hidden search are part of that research. In this paper, the proposed access control model with encrypted processing of data is useful for guarding against an untrusted provider and malicious users in the cloud.

The access control model for IoT cloud storage incorporates the authentication of the customer and the customer's current access level. The token identification (TID) is attached as soon as the user logs in to the system. To maintain the security of the data and its trust level, we have to define several control parameters for user access in the cloud. The TID model in equation 6.1 is defined with seven parameters.

where

UID User Identification and access rights

IID Issue Date

MDT Maximum Date (expiration date)

TA Time of Access

PA Place of Access (current place and node ID)

LGE Log Entry (UID, IID, MDT, TA, PA, LGE)

SA Security Alarm
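As an illustration only, the seven TID parameters listed above can be grouped into a single record. The sketch below is a minimal Python rendering; the class name `TokenID`, the field names, the field types, and the `is_valid_on` helper are all hypothetical choices made for this example and are not part of the original model.

```python
from dataclasses import dataclass, field
from datetime import date, datetime


@dataclass
class TokenID:
    """Hypothetical container for the seven TID parameters."""
    uid: str                      # UID: user identification
    access_rights: frozenset      # access rights carried with the UID
    issue_date: date              # IID: issue date
    max_date: date                # MDT: maximum (expiration) date
    time_of_access: datetime      # TA: time of access
    place_of_access: str          # PA: current place and node ID
    log_entry: list = field(default_factory=list)  # LGE: log entries
    security_alarm: bool = False  # SA: security alarm flag

    def is_valid_on(self, day: date) -> bool:
        # Assumed rule: a token is usable from its issue date
        # through its expiration date, inclusive.
        return self.issue_date <= day <= self.max_date


# Example values are invented for illustration.
token = TokenID(
    uid="u42",
    access_rights=frozenset({"read"}),
    issue_date=date(2024, 1, 1),
    max_date=date(2024, 12, 31),
    time_of_access=datetime(2024, 6, 1, 9, 30),
    place_of_access="node-7",
)
```

Keeping the token as one immutable-looking record makes it straightforward to write the whole token into the user and cloud logs on each access attempt.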

The customer is either the owner of the data or another customer. In either case, the customer is a client with specific access rights. Once the customer logs in to the cloud network, the authentication access token is attached to the user account. The token verifies the customer/user access limits and allows or denies file access accordingly. Further, the system records each attempt by a user to access a particular file, with all details, in the customer/user and cloud log tables. The various validation and verification checks on modifications help to find unauthorized access. The trustworthiness of the provider or customer/user can be calculated using the log values.

The trustworthiness of a customer/user can be calculated using the trust function in equation 6.2. Each entry of the user is assigned a weight W. The entry W_ij denotes the jth entry of the ith user. Let N_g be the number of times the ith user has shown good behavior, and N_b the number of times the user has shown bad behavior. The user entry value is multiplied by the weight W for right or wrong actions to calculate the trustworthiness of the customer/user. The trustworthiness T_i of the ith user/customer is calculated as follows.

If T_i, the trust value, is above the threshold, the customer/user is considered good; otherwise a false alarm alerts the owner. The user may be a customer, provider, or owner. The weight varies between 0 and 1, and the number of times the data (or data files) is accessed ranges from 0 to 10. If weight = 0 then x = 1, else x = 0. Similarly, if the number of times = 0 then y = 1, otherwise y = 0.

Improved Trust calculation

The improved trustworthiness of the user can be calculated as shown in equation 6.3. Equation 6.3 estimates the trustworthiness T_i of the ith user.

Initially, T_i is set to 0. We provide the initial values to calculate the first T_i. If T_i, the trust value, is below the set threshold, the user is considered bad and the system signals the false alarm. The user may be a customer, provider, or owner. The weight varies between 0 and 1, and the number of times the data (or data files) has been accessed ranges from 0 to 10. If weight = 0 then x = 1, else x = 0. Similarly, if the number of times the cloud has been accessed = 0 then y = 1, otherwise y = 0.

Figure 6.2: Experiment 2: Average of 10 random accesses of a user to data by each user

Simulations

Simulations of equation 6.2 were performed, and the results are provided in Figures 6.2, 6.3, and 6.4. The threshold for a legal (trusted) user was fixed at 0.85 and above (>= 0.85). We assumed that no user logged in over the internet is completely trusted (any user may be a hacker or malicious user), whatever the assumed average trust level. For example, if the trust value averaged over ten (10) attempts is greater than 0.85, the user is assumed to be legal or trusted. The program was developed in MATLAB to plot equation 6.2, and Figures 6.2, 6.3, and 6.4 present sample results. Random data were generated for each user, and the average of 10 access values was plotted. Figure 6.3 shows a user identified as malicious or untrusted, based on the number of attempts to reach the system or to access a specific database or piece of information; that is, the calculated trust level of the user's accesses is below 0.85. This result automatically triggers an alarm to the security manager. In our simulations, the user access values depend upon the random values and the corresponding weights provided during execution. The data are a randomly selected sample for the test calculation. A full calculation requires real data (not provided), since it involves various parameters for each user logging in and processing the data.
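The thresholding step of this simulation can be sketched in Python (the original program was written in MATLAB and is not reproduced here). This is only a rough illustration under stated assumptions: each access value is drawn uniformly from [0, 1], and a plain mean over 10 accesses stands in for equation 6.2, whose exact form is not shown in this text. The function names and the seed are hypothetical.

```python
import random

THRESHOLD = 0.85          # trusted if the average is >= 0.85 (from the text)
ACCESSES_PER_USER = 10    # average of 10 accesses per user (from the text)


def average_trust(access_values):
    # Assumption: a plain mean stands in for the trust function.
    return sum(access_values) / len(access_values)


def classify(access_values, threshold=THRESHOLD):
    # Below the threshold, the security manager would be alerted.
    if average_trust(access_values) >= threshold:
        return "trusted"
    return "alarm"


def simulate(num_users, seed=0):
    # Assumption: uniform random access values in [0, 1].
    rng = random.Random(seed)
    results = {}
    for user in range(num_users):
        values = [rng.random() for _ in range(ACCESSES_PER_USER)]
        results[user] = (average_trust(values), classify(values))
    return results


if __name__ == "__main__":
    for user, (avg, label) in simulate(5).items():
        print(f"user {user}: average={avg:.2f} -> {label}")
```

With uniformly random values most simulated users fall below 0.85, which is consistent with the assumption above that no user is trusted by default.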

The improved trustworthiness of the user is calculated as in Figures 6.5, 6.6, and 6.7. The threshold value determines whether the user is classified as legal or malicious. As in the previous case, if the user is genuine, the trust value is higher than 0.85 averaged over ten (10) accesses of the user to the cloud. The simulations were created using MATLAB, and the results are shown in Figures 6.5, 6.6, and 6.7. Random data were generated for each user, and the average of 10 access values was plotted. The figures indicate zero or more malicious users. Figure 6.7 shows more hacking activity than Figure 6.5. Note that the data are a simple test calculation.

Figure 6.3: Experiment 3: Average of 10 random accesses of a user to data by each user (sample graphs)

Figure 6.5: Experiment 5: Average of 10 random accesses of a user to data by each user (sample graphs)

Figure 6.7: Experiment 7: Average of 10 random accesses of a user to data by each user (sample graphs)

Access Controls on Sensitive Data

Trustworthiness alone is not sufficient to grant access to sensitive data. Along with trustworthiness, the procedure requires the user access limits, the day, the time of day, and the log entry for validation. The user identification contains the access rights and the UID issue date, expiration date, time of access, and location of access (depending upon the sensitivity of the data). Once the user logs in to the system, the cloud log and owner log entries are automatically registered. For hackers, only the cloud log entry appears. The various validation and verification checks reveal the hacking. The token identification parameters in equation 6.1 are used in the objective function G.

The objective function G replaces TID, N replaces UID (contains IID, MDT, TA, and PA), D is a data file (or database), and U replaces LGE. The security alarm will be activated depending upon hacker identification or trust failure. The parameters are explained further below.

N the set of users (n_1, n_2, ..., n_m)

A the set of access rights (a_1, a_2, ..., a_r)

D the set of allowed resources in a file or database (d_1, d_2, ..., d_l)

U the result of the query and log entries for verification and validation.

Once the authenticated user n_i (n_i ∈ N) logs in to the cloud environment, the CCCRN service attaches the service token a_i to a resource within its domain with a set of access types. This limitation helps to control the user's access to resources. For every service requested by the user, the system generates a set of access permissions to the resources. The services requested should not exceed the user's access limits. If the resource requirements are outside the user's boundaries, the system alerts security and denies the request. A hacker is a user that does not have any role in the system. An authorized user will also be treated as a hacker if the user tries to access unauthorized information. For example, a healthcare staff member will be considered an intruder if that user accesses unauthorized data or misuses (for instance, by printing and forwarding) authorized information.

In the proposed CCCRN environment, the user with complete authorization access is called a super user (S). The super user S possesses the access rights of all users, S ⊇ a_i (i = 1, ..., n), where ⊇ means contains. All accesses of the super user on the database must be recorded. A user that does not have authorization to a resource (or resources) is called a hacker h_i, represented as H (h_i ∈ H), and for all hackers (∀H) the relation a_ih → d_i = φ is true; a_ih denotes the access rights of the hackers (→ denotes implication, and = denotes equivalence). Using this information, we design two algorithms.

Algorithm 1:

If the query Q(n_i, d_j) matches n_i as owner for token identification (TokenID), then the corresponding utility function u_i will be generated; else the query reflects as Q(n_i, hd_j), where h is a hacker.

If the hacker is an internal user then

hu_i ⊇ u_i + hd_j (n_i internal user), alarms security manager about internal hacker.

if Q(n_i, d_j) ⊂ u_i else

if Q(n_i, d_j) ⊄ u_i (that is, Q(n_i, d_j) = u_i + hd_j) then

Convert Q(n_i, d_j) as Q(n_i, hd_j) and generate hu_i ⊇ u_i + hd_j

Store the user utility hu_i, which contains u_i + hd_j, inform security, and keep the counter (log) on alert for further attempts.
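The flow of Algorithm 1 can be approximated in a few lines of Python. This is a loose sketch rather than the authors' implementation: the class `AccessMonitor`, its fields, and the returned strings are hypothetical, and the "new hacker, alarm" versus "alarm and freeze" actions are borrowed from the hacker log in Table 6.1. A user is treated as internal when the user ID is known to the system, which is an assumption of this example.

```python
from dataclasses import dataclass, field


@dataclass
class AccessMonitor:
    # Hypothetical mapping: user id -> set of resources the user may access.
    allowed: dict
    # Hacker log: user id -> number of unauthorized attempts so far.
    hacker_log: dict = field(default_factory=dict)

    def query(self, user, resource):
        # Authorized query: the resource lies within the user's rights.
        if resource in self.allowed.get(user, set()):
            return ("granted", resource)

        # Unauthorized query: log the attempt and raise an alarm.
        attempts = self.hacker_log.get(user, 0) + 1
        self.hacker_log[user] = attempts

        # Assumption: known user ids are internal, unknown ones external.
        kind = "internal" if user in self.allowed else "external"
        # New hacker -> alarm; repeated hacker -> alarm and freeze.
        action = "alarm" if attempts == 1 else "alarm and freeze"
        return ("denied", f"{kind} hacker: {action}")


if __name__ == "__main__":
    monitor = AccessMonitor({"nurse": {"schedule"}})
    print(monitor.query("nurse", "schedule"))
    print(monitor.query("nurse", "billing"))
    print(monitor.query("nurse", "billing"))
```

The counter kept in `hacker_log` plays the role of the log that stays on alert for further attempts.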

Algorithm 1 helps to detect the hacker if the user tries to obtain information from the database through unauthorized access. The following query and Table 6.1 explain the unauthorized access to information.

If Q(n_i, d_j) = Q(hn_i, d_j) ⊄ u_i or Q(hn_i, d_j) ⊂ hu_i, then Q(hn_i, d_j) = hu_i; retrieve hu_i (the utility from the hacker alarm to the database) and trigger the security alarm, where hu_i is either available in the log or identified as a new hacker and logged as a new entry. The log is provided in Table 6.1.

Cloud Network Management: An IoT Based Framework

Table 6.1: Hacker Log and Action

Hacker    Status    Result    Action
A         new       hu_i      New hacker, alarm
A         repeat    hu_i      Alarm and freeze

In general, if the hacker attempts to gain access to the database at different times, the time attribute plays an important role in detecting the hacker. Algorithm 1 is therefore modified as Algorithm 2.

Algorithm 2:

If Q(n_i, t_j, d_j) is genuine and attempted during duty times, then the corresponding utility function u_i will be generated,

else the query reflects as Q(n_i, t_j, hd_j) and the user will get hu_i ⊇ u_i + hd_j (where u_i is the internal user information and hd_j is the hacker alarm at time t_j). if Q(n_i, t_j, d_j) ⊂ u_i then exit (user access accepted) else

if (Q(n_i, t_j, d_j) ⊄ u_i && Q(n_i, t_j, d_j) = hu_i)

Convert Q(n_i, t_j, d_j) as Q(n_i, t_j, hd_j) and generate hu_i ⊇ u_i + hd_j (alarm alert to security manager)

Note: Store the user utility hu_i, which contains u_i + hd_j, alert security, and keep the counter for further attempts.
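The time check that Algorithm 2 adds can be illustrated with a short Python sketch. It is a simplification under stated assumptions: duty times are modeled as a single daily interval per user, and the function name `check_query`, its parameters, and the returned messages are hypothetical names chosen for this example.

```python
from datetime import time


def check_query(user_rights, duty_hours, user, resource, at):
    """Classify a query by access rights and by time of access.

    user_rights: hypothetical mapping of user id -> set of resources
    duty_hours:  hypothetical mapping of user id -> (start, end) times
    """
    in_rights = resource in user_rights.get(user, set())
    start, end = duty_hours.get(user, (None, None))
    # Assumption: duty time is one interval within the same day.
    on_duty = start is not None and start <= at <= end

    if in_rights and on_duty:
        return "accepted"
    if in_rights:
        # Valid rights used outside duty times: the time stamp
        # exposes a possible internal hacker.
        return "alarm: outside duty hours"
    return "alarm: unauthorized resource"


if __name__ == "__main__":
    rights = {"nurse": {"medication"}}
    hours = {"nurse": (time(8, 0), time(16, 0))}
    print(check_query(rights, hours, "nurse", "medication", time(10, 0)))
    print(check_query(rights, hours, "nurse", "medication", time(23, 0)))
    print(check_query(rights, hours, "nurse", "billing", time(10, 0)))
```

Separating the rights check from the duty-time check keeps the two alarm causes distinguishable in the log.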

If the hacker is external, then divert to the KDS. If the user hacks with authentication, then the time stamp will help to detect the hacker. For example,

If Q(n_i, t_j, d_j) = Q(hn_i, t_j, d_j) ⊂ hu_i then Q(hn_i, t_j, d_j) = hu_i

Retrieve hu_i and alert security, where hu_i is either available in the log or identified as a new hacker and logged as a new entry. Table 6.2 provides the log entries.

Table 6.2: Hacker Log and Detection

Hacker    Status                Time              Result    Action
A         New, internal         Outside-bounds    hu_i      Detect as internal hacker and alarm
A         Repeated, internal    Within-bounds     hu_i      Check for presence of real user and alarm and find real user

Depending upon the security level, Algorithm 2 can be modified by adding the terminal type and log-on timings. The terminal type and time of access attributes, along with the access type attributes, will protect secret and top secret information.

Consider a hospital environment in the healthcare system. A doctor and a nurse have the same access rights to an individual patient's data (the doctor prescribes the medicine, and the nurse administers it). The attributes patient ID, type of medication, and scheduled time of the dose to be given to the patient are then accessible to the nurse. The same attributes are also available to the doctor. Therefore, the system security depends upon the merging and decomposition of the access rights of two or more users.

Conclusions

The issues and challenges in IoT were discussed in [249], [36], [313], [404], [353], [60], [120], [331], and the processing of complex data in the cloud and its security issues were discussed in [391], [55], [193], [390]. We found that a trust and access control methodology is required in IoT and cloud environments for real-time access to data and processing for decision making. Therefore, in the current research, an objective function was proposed with a set of users, associated access rights, resources, and verification of the returned result. The proposed model is appropriate for big data in a cloud environment where IoT devices are involved. Further, two algorithms were presented, and the model can be extended to Hadoop distributed file systems to detect external and internal hackers in a cloud environment. Tables were presented for hacker detection through the algorithms. The user entry logs, authentication, and access rights play a significant role in providing hacker information to the security administrator.

 