Metadata Discovery Problem - To enable a collaborative Model-based Systems Engineering (MBSE) environment through computer systems, it is essential to allow tools to communicate (interoperability) and to reuse previous engineering designs, saving costs and time. In this context, understanding the underlying concepts and relationships embedded in system artifacts becomes a cornerstone for properly exploiting engineering artifacts. MBSE tool-chains and suites, such as Matlab Simulink, can be applied to different engineering activities: architecture design (descriptive modeling), simulation (analytical modeling) or verification. Reuse capabilities in specific engineering tools are a non-functional aspect that is usually covered by providing some type of search capability based on artifact metadata. In this work, we aim to ease the reuse of the knowledge embedded in Simulink models through a solution called PhysicalModel2Simulink. The proposed approach makes use of an ontology for representing, indexing and retrieving information following a meta-model (mainly to semantically represent concepts and relationships). Under this schema, both metadata and contents are represented using a common domain vocabulary and taxonomy, creating a property graph that can be exploited for system artifact discovery. To do so, a mapping between the Matlab Simulink meta-model and the RSHP (RelationSHiP) meta-model is defined to represent and serialize physical models in a repository. A retrieval process is then implemented on top of this repository to allow users to perform text-based queries and look up similar artifacts. To validate the proposed solution, 38 Simulink models were used and 20 real user queries were designed to study the effectiveness, in terms of precision and recall, of the proposed solution against the Matlab Simulink search capabilities.
Authored by Eduardo Cibrian, Roy Mendieta, Jose Alvarez-Rodriguez, Juan Llorens
Metadata Discovery Problem - Collaborative software development platforms like GitHub have gained tremendous popularity. Unfortunately, many users have reportedly leaked authentication secrets (e.g., textual passwords and API keys) in public Git repositories, causing security incidents and financial loss. Recently, several tools were built to investigate secret leakage in GitHub. However, these tools could only discover and scan a limited portion of the files in GitHub due to platform API restrictions and bandwidth limitations. In this paper, we present SecretHunter, a real-time, large-scale, comprehensive secret scanner for GitHub. SecretHunter resolves the file discovery and retrieval difficulty via two major improvements to the Git cloning process. Firstly, our system retrieves file metadata from repositories before cloning file contents. The early metadata access helps identify newly committed files and enables many bandwidth optimizations such as filename filtering and object deduplication. Secondly, SecretHunter adopts a reinforcement learning model to analyze file contents being downloaded and infer whether the file is sensitive. If not, the download process can be aborted to conserve bandwidth. We conduct a one-month empirical study to evaluate SecretHunter. Our results show that SecretHunter discovers 57\% more leaked secrets than state-of-the-art tools. SecretHunter also reduces bandwidth consumption in the object retrieval process by 85\% and can be used in low-bandwidth settings (e.g., 4G connections).
Authored by Elliott Wen, Jia Wang, Jens Dietrich
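SecretHunter's actual sensitivity classifier is a reinforcement learning model; as a rough, dependency-free illustration of the underlying two-stage idea (decide from metadata before fetching content, then flag high-entropy candidate secrets), the sketch below uses a hypothetical `SENSITIVE_TOKENS` list and an entropy threshold that are illustrative assumptions, not the paper's values:

```python
import math

def shannon_entropy(s: str) -> float:
    """Shannon entropy of s in bits per character."""
    if not s:
        return 0.0
    n = len(s)
    probs = [s.count(c) / n for c in set(s)]
    return -sum(p * math.log2(p) for p in probs)

# Hypothetical filename tokens that suggest a file may hold credentials.
SENSITIVE_TOKENS = (".env", "id_rsa", "credential", "secret", "config")

def worth_downloading(path: str) -> bool:
    # Metadata-only decision: filter on the file path before fetching its
    # content, which is where the bandwidth savings come from.
    return any(tok in path.lower() for tok in SENSITIVE_TOKENS)

def looks_like_secret(token: str, threshold: float = 4.0) -> bool:
    # Long, high-entropy strings are flagged as candidate leaked secrets.
    return len(token) >= 20 and shannon_entropy(token) > threshold
```

In a real scanner the path filter would run against the refs/tree metadata obtained from a partial clone, aborting blob downloads that fail it.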
Metadata Discovery Problem - Millions of connected devices, such as connected cameras streaming video, are introduced to smart cities every year and are a valuable source of information. However, this rich source of information is mostly left untapped. Thus, in this paper, we propose distributed deep neural networks (DNNs) over edge visual Internet of Things (VIoT) devices for parallel, real-time video scene parsing and indexing, in conjunction with BigQuery retrieval on data stored in the cloud. The IoT video streams are parsed into adaptive metadata of persons, attributes, actions, objects, and relations using pre-trained DNNs. The metadata is cached at the edge-cloud for real-time analytics and also continuously transferred to the cloud for data fusion and BigQuery batch processing. The proposed distributed deep learning search platform bridges the edge-to-cloud continuum by utilizing state-of-the-art distributed deep learning and BigQuery search algorithms for the geo-distributed Visual Internet of Things (VIoT). We show that our proposed system supports real-time event-driven computing at 122 milliseconds on virtual IoT devices in parallel, and as low as 2.4 seconds batch query response time on multi-table JOIN and GROUP-BY aggregation.
Authored by Arun Das, Mehdi Roopaei, Mo Jamshidi, Peyman Najafirad
Metadata Discovery Problem - To conduct a well-designed and reproducible study, researchers must define and adhere to clear inclusion and exclusion criteria for subjects. Similarly, a well-run journal or conference should publish easily understood inclusion and exclusion criteria that determine which submissions will receive more detailed peer review. This will empower authors to identify the conferences and journals that are the best fit for their manuscripts while allowing organizers and peer reviewers to spend more time on the submissions that are of greatest interest. To provide a more systematic way of representing these criteria, we extend the syntax for concept-validating constraints of the Nexus-PORTAL-DOORS-Scribe cyberinfrastructure, which already serve as criteria for inclusion of records in a repository, to allow description of exclusion criteria.
Authored by Adam Craig, Carl Taswell
Metadata Discovery Problem - We present a methodology for constructing a spatial ontology-based dataset navigation model to allow cross-reference navigation between datasets. We defined the structure of a dataset as its metadata, its field names, and its actual values. We defined the relationships between datasets in three layers: a metadata layer, a field name layer, and a data value layer. The relationships in the metadata layer were defined as correspondences between metadata values. We standardized the field names in the datasets to discover the relationships between field names. We designed a method to discover the relationships between data values based on common knowledge datasets for each domain. To confirm the validity of the presented methodology, we applied it to implement an ontology-based knowledge navigation model for actual disaster-related processes in operation. We built a knowledge navigation model based on spatial common knowledge.
Authored by Yun-Young Hwang, Sumi Shin
Metadata Discovery Problem - We defined and expressed graph-based relationships between pieces of knowledge to allow cross-reference navigation of the knowledge as an ontology. We present a methodology for constructing an ontology-based knowledge navigation model to allow cross-reference navigation between pieces of knowledge, related concepts, and datasets. We defined the structure of a dataset as its metadata, the field names of the actual values, and the actual values. We defined the relationships between datasets in three layers: a metadata layer, a field name layer, and a data value layer. The relationships in the metadata layer were defined as correspondences between metadata values. We standardized the field names in the datasets to discover the relationships between field names. We designed a method to discover the relationships between data values based on common knowledge for each domain. To confirm the validity of the presented methodology, we applied it to implement an ontology-based knowledge navigation model for actual disaster-related processes in operation. We built a knowledge navigation model based on spatial common knowledge to confirm that the configuration of the knowledge navigation model was correct.
Authored by Yun-Young Hwang, Jiseong Son, Sumi Shin
Metadata Discovery Problem - The OPC UA (Open Platform Communications Unified Architecture) technology is found in many industrial applications, as it addresses many of Industry 4.0’s requirements. One of its appeals is its service-oriented architecture. Nonetheless, it requires engineering effort during deployment and maintenance to bind or associate the correct services to a client or consumer system. We propose the integration of OPC UA with the Eclipse Arrowhead Framework (EAF) to enable automatic service discovery and binding at runtime, reducing delays, costs, and errors. The integration also enables the client system to obtain the service endpoints by querying the service attributes or metadata. Moreover, this forms a bridge to other industrial communication technologies such as Modbus TCP (Transmission Control Protocol), as the framework is not limited to a specific protocol. To demonstrate the idea, an indexed line with an industrial PLC (programmable logic controller) running an OPC UA server is used to show that the desired service endpoints are revealed at runtime when querying their descriptive attributes or metadata through the EAF’s Orchestrator system.
Authored by Aparajita Tripathy, Jan Van Deventer, Cristina Paniagua, Jerker Delsing
Metadata Discovery Problem - Researchers seeking to apply computational methods are increasingly turning to scientific digital archives containing images of specimens. Unfortunately, metadata errors can inhibit the discovery and use of scientific archival images. One such case is the NSF-sponsored Biology Guided Neural Network (BGNN) project, where an abundance of metadata errors has significantly delayed development of a proposed new class of neural networks. This paper reports on research addressing this challenge. We present a prototype workflow for specimen scientific name metadata verification that is grounded in Computational Archival Science (CAS), and report on a taxonomy of specimen name metadata error types with preliminary solutions. Our 3-phased workflow includes tag extraction, text processing, and interactive assessment. A baseline test with the prototype workflow identified at least 15 scientific name metadata errors out of 857 manually reviewed, potentially erroneous specimen images, corresponding to a ∼ 0.2\% error rate for the full image dataset. The prototype workflow minimizes the amount of time domain experts need to spend reviewing archive metadata for correctness and AI-readiness before these archival images can be utilized in downstream analysis.
Authored by Joel Pepper, Andrew Senin, Dom Jebbia, David Breen, Jane Greenberg
Metadata Discovery Problem - Semantic segmentation is one of the key research areas in computer vision, which has very important applications in areas such as autonomous driving and medical image diagnosis. In recent years, the technology has advanced rapidly, where current models have been able to achieve high accuracy and efficient speed on some widely used datasets. However, the semantic segmentation task still suffers from the inability to generate accurate boundaries in the case of insufficient feature information. Especially in the field of medical image segmentation, most of the medical image datasets usually have class imbalance issues and there are always variations in factors such as shape and color between different datasets and cell types. Therefore, it is difficult to establish general algorithms across different classes and robust algorithms that differ across different datasets. In this paper, we propose a conditional data preprocessing strategy, i.e., Conditional Metadata Embedding (CME) data preprocessing strategy. The CME data preprocessing method will embed conditional information to the training data, which can assist the model to better overcome the differences in the datasets and extract useful feature information in the images. The experimental results show that the CME data preprocessing method can help different models achieve higher segmentation performance on different datasets, which shows the high practicality and robustness of this method.
Authored by Juntuo Wang, Qiaochu Zhao, Dongheng Lin, Erick Purwanto, Ka Man
Metadata Discovery Problem - Open Educational Resources (OER) are educational materials that are available in different repositories such as Merlot, SkillsCommons, MIT OpenCourseWare, etc. The quality of metadata facilitates the search and discovery of educational resources. This work evaluates the metadata quality of 4142 OER from SkillsCommons. We applied supervised machine learning algorithms (Support Vector Machine and Random Forest Classifier) for automatic classification of two metadata fields: description and material type. Based on our data and model, the performance of a first classification effort is reported, with an accuracy of 70\%.
Authored by Veronica Segarra-Faggioni, Audrey Romero-Pelaez
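The abstract does not give the features or hyperparameters of the SVM and Random Forest models, so as a minimal, dependency-free stand-in, the sketch below classifies a resource's material type from its description text with a tiny bag-of-words nearest-centroid classifier; the class, the toy training data, and the labels are all illustrative assumptions:

```python
from collections import Counter
import math

def tokens(text):
    # Lowercased alphabetic tokens as a trivial bag-of-words feature set.
    return [t for t in text.lower().split() if t.isalpha()]

class CentroidClassifier:
    """Tiny nearest-centroid text classifier; a stand-in for the SVM and
    Random Forest models used in the paper (illustrative only)."""

    def fit(self, texts, labels):
        counts = {}
        for text, label in zip(texts, labels):
            counts.setdefault(label, Counter()).update(tokens(text))
        self.centroids = {}
        for label, c in counts.items():
            norm = math.sqrt(sum(v * v for v in c.values()))
            self.centroids[label] = {t: v / norm for t, v in c.items()}
        return self

    def predict(self, text):
        c = Counter(tokens(text))
        def score(label):
            cen = self.centroids[label]
            return sum(cen.get(t, 0.0) * v for t, v in c.items())
        return max(self.centroids, key=score)
```

A real pipeline would substitute TF-IDF features and a trained SVM or Random Forest, evaluated on the 4142 labeled SkillsCommons records.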
Measurement and Metrics Testing - In software regression testing, newly added test cases are more likely to fail and should therefore be prioritized for execution. In software regression testing for continuous integration, reinforcement learning-based approaches are promising, and the RETECS (Reinforced Test Case Prioritization and Selection) framework is a successful application case. RETECS uses an agent composed of a neural network to predict the priority of test cases, and the agent needs to learn from historical information to improve. However, newly added test cases have no historical execution information, so using RETECS to predict their priority is close to random. In this paper, we focus on new test cases in continuous integration testing. On the basis of the RETECS framework, we first propose a priority assignment method for new test cases to ensure that they are executed first. Secondly, continuous integration is a fast, iterative integration method in which new test cases have strong fault detection capability within the latest periods; therefore, we further propose an additional reward method for new test cases. Finally, based on full lifecycle management, the ‘new’ additional reward needs to be terminated after a certain period, and this paper conducts an empirical study of this. We ran 30 iterations of the experiment on 12 datasets, and our best results were 19.24\%, 10.67\%, and 34.05 positions better than the best parameter combination in RETECS for the NAPFD (Normalized Average Percentage of Faults Detected), RECALL, and TTF (Test to Fail) metrics, respectively.
Authored by Fanliang Chen, Zheng Li, Ying Shang, Yang Yang
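The two mechanisms described above, priority assignment for history-less test cases and a time-limited extra reward, can be sketched as follows; the bonus value, the 'new' window length, and the test-case fields are illustrative assumptions, not RETECS parameters:

```python
def assign_priority(test_case, agent_priority, max_priority=1.0):
    # Idea from the abstract: a brand-new test case (no execution history)
    # gets the maximum priority so it runs first; otherwise keep the
    # RL agent's predicted priority.
    if not test_case["history"]:
        return max_priority
    return agent_priority

def reward(test_case, base_reward, bonus=0.5, new_window=3):
    # Additional reward while a test case is still 'new', terminated after
    # `new_window` CI cycles (full lifecycle management of the bonus).
    if test_case["age_cycles"] <= new_window:
        return base_reward + bonus
    return base_reward
```

In the full framework these values feed the neural-network agent's training signal rather than being used directly for scheduling.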
Measurement and Metrics Testing - The increase of smartphone users in Indonesia has led various sectors to improve their services through mobile applications, including the healthcare sector. The healthcare sector is considered a critical sector, as it stores various health data of its users that are classified as confidential. This is the basis for the need to conduct a security analysis of the mobile health applications widely used in Indonesia. MobSF (Mobile Security Framework) and the MARA (Mobile Application Reverse Engineering and Analysis) Framework are mobile application security analysis methods capable of assessing security levels based on the OWASP (Open Web Application Security Project) Mobile Top 10 2016 classification, CVSS (Common Vulnerability Scoring System), and CWE (Common Weakness Enumeration). It is expected that the test results with MobSF and MARA can provide a security metric for mobile health applications as a source of security information for users and application developers.
Authored by Dimas Priambodo, Guntur Ajie, Hendy Rahman, Aldi Nugraha, Aulia Rachmawati, Marcella Avianti
Measurement and Metrics Testing - FIPS 140-3 is the main standard defining security requirements for cryptographic modules in the U.S. and Canada; commercially viable hardware modules generally need to be compliant with it. The scope of FIPS 140-3 will also expand to the new NIST Post-Quantum Cryptography (PQC) standards when migration from older RSA and Elliptic Curve cryptography begins. FIPS 140-3 mandates the testing of the effectiveness of “non-invasive attack mitigations”, or side-channel attack countermeasures. At the higher security levels 3 and 4, the FIPS 140-3 side-channel testing methods and metrics are expected to be those of ISO 17825, which is based on the older Test Vector Leakage Assessment (TVLA) methodology. We discuss how to apply ISO 17825 to hardware modules that implement lattice-based PQC standards for public-key cryptography – Key Encapsulation Mechanisms (KEMs) and Digital Signatures. We find that simple “random key” vs. “fixed key” tests are unsatisfactory due to the close linkage between public and private components of PQC keypairs. While the general statistical testing approach and requirements can remain consistent with older public-key algorithms, a non-trivial challenge in creating ISO 17825 testing procedures for PQC is the careful design of test vector inputs so that only relevant Critical Security Parameter (CSP) leakage is captured in power, electromagnetic, and timing measurements.
Authored by Markku-Juhani Saarinen
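ISO 17825 / TVLA leakage assessment rests on Welch's t-test between a "fixed" and a "random" set of side-channel traces; a minimal sketch at a single sample point, using the conventional |t| > 4.5 pass/fail threshold (the trace representation here is an illustrative simplification):

```python
import math

def welch_t(a, b):
    """Welch's t-statistic between two trace sets at one sample point --
    the core statistic of TVLA / ISO 17825 leakage assessment."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((x - ma) ** 2 for x in a) / (na - 1)   # sample variances
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)
    return (ma - mb) / math.sqrt(va / na + vb / nb)

def leaks(fixed_traces, random_traces, threshold=4.5):
    # |t| exceeding 4.5 is the conventional TVLA leakage threshold.
    return abs(welch_t(fixed_traces, random_traces)) > threshold
```

In practice the test is run over every sample point of full traces and repeated on independent acquisition sets; the paper's point is that for PQC the hard part is choosing the fixed/random input classes, not the statistic itself.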
Measurement and Metrics Testing - This paper belongs to a sequence of manuscripts that discuss generic and easy-to-apply security metrics for Strong PUFs. These metrics cannot and shall not fully replace in-depth machine learning (ML) studies in the security assessment of Strong PUF candidates. But they can complement the latter, serve in initial PUF complexity analyses, and are much easier and more efficient to apply: they do not require detailed knowledge of various ML methods, substantial computation times, or the availability of an internal parametric model of the studied PUF. Our metrics can also be standardized particularly easily. This avoids the sometimes inconclusive or contradictory findings of existing ML-based security tests, which may result from the usage of different or non-optimized ML algorithms and hyperparameters, differing hardware resources, or varying numbers of challenge-response pairs in the training phase.
Authored by Fynn Kappelhoff, Rasmus Rasche, Debdeep Mukhopadhyay, Ulrich Rührmair
Measurement and Metrics Testing - Fuzz testing is an indispensable test-generation tool in software security. Fuzz testing uses automated directed randomness to explore a variety of execution paths in software, trying to expose defects such as buffer overflows. Since cyber-physical systems (CPS) are often safety-critical, testing models of CPS can also expose faults. However, while existing coverage-guided fuzz testing methods are effective for software, results can be disappointing when applied to CPS, where systems have continuous states and inputs are applied at different points in time.
Authored by Sanaz Sheikhi, Edward Kim, Parasara Duggirala, Stanley Bak
Measurement and Metrics Testing - Nowadays, attackers are increasingly using Use-After-Free (UAF) vulnerabilities to create threats against software security. Existing static approaches for UAF detection are capable of finding potential bugs in large code bases. In most cases, analysts perform manual inspections to verify whether the warnings detected by static analysis are real vulnerabilities. However, due to the complex constraints of constructing a UAF vulnerability, it is very time- and cost-intensive to screen all warnings. In fact, many warnings should be discarded before the manual inspection phase because they are almost impossible to trigger in the real world, which is often overlooked by current static analysis techniques.
Authored by Haolai Wei, Liwei Chen, Xiaofan Nie, Zhijie Zhang, Yuantong Zhang, Gang Shi
Measurement and Metrics Testing - Software testing is one of the most critical and essential processes in the software development life cycle. It is the most significant aspect affecting product quality. Quality and service are critical success factors, particularly in the software business development market. As a result, enterprises must execute software testing and invest resources in it to ensure that their software products meet the needs and expectations of end-users. Test prioritization and evaluation are the key factors in determining the success of software testing. Test suite coverage metrics are commonly used to evaluate the testing process. Soft computing techniques like Genetic Algorithms and Particle Swarm Optimization have gained prominence in various aspects of testing. This paper proposes an automated Genetic Algorithm approach to prioritizing test cases, evaluated through code coverage metrics with the Coverlet tool. Coverlet is a .NET code coverage tool that works across platforms and supports line, branch, and method coverage. Coverlet gathers data from Cobertura coverage test runs, which are then utilized to generate reports. The resulting test suites were validated and analyzed and showed significant improvement over the generations.
Authored by Baswaraju Swathi
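A minimal sketch of a Genetic Algorithm for coverage-based test case prioritization in the spirit of the paper; the fitness function (earlier coverage scores more), the swap mutation, and all parameters are illustrative assumptions, not the authors' implementation:

```python
import random

def coverage(order, cov_map, n_lines):
    """Fitness of a test ordering: lines newly covered earlier in the
    ordering contribute more (an illustrative APLC-style weighting)."""
    covered, score = set(), 0.0
    for rank, tc in enumerate(order):
        new = cov_map[tc] - covered          # lines first covered here
        covered |= new
        score += len(new) * (len(order) - rank)
    return score / (n_lines * len(order))

def evolve(cov_map, n_lines, pop_size=20, generations=50, seed=1):
    # Simple elitist GA over permutations of the test cases.
    rng = random.Random(seed)
    tests = list(cov_map)
    pop = [rng.sample(tests, len(tests)) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda o: coverage(o, cov_map, n_lines), reverse=True)
        survivors = pop[: pop_size // 2]     # keep the fitter half
        children = []
        for parent in survivors:
            child = parent[:]
            i, j = rng.randrange(len(child)), rng.randrange(len(child))
            child[i], child[j] = child[j], child[i]   # swap mutation
            children.append(child)
        pop = survivors + children
    return max(pop, key=lambda o: coverage(o, cov_map, n_lines))
```

In the paper's setting, `cov_map` would be populated from Coverlet's Cobertura reports rather than hand-written sets.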
Measurement and Metrics Testing - Due to the increasing complexity of modern heterogeneous System-on-Chips (SoC) and the growing vulnerabilities, security risk assessment and quantification is required to measure the trustworthiness of a SoC. This paper describes a systematic approach to model the security risk of a system for malicious hardware attacks. The proposed method uses graph analysis to assess the impact of an attack and the Common Vulnerability Scoring System (CVSS) is used to quantify the security level of the system. To demonstrate the applicability of the proposed metric, we consider two open source SoC benchmarks with different architectures. The overall risk is calculated using the proposed metric by computing the exploitability and impact of attack on critical components of a SoC.
Authored by Sujan Saha, Joel Mbongue, Christophe Bobda
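The metric described above combines per-component exploitability and impact via CVSS and aggregates over the SoC; a simplified sketch of such a combination (the additive base-score formula, the clamping, and the criticality weights are assumptions for illustration, not the paper's exact equations):

```python
def cvss_base(exploitability: float, impact: float) -> float:
    """Simplified CVSS-style base score: combine exploitability and impact
    subscores and clamp to the 0-10 scale (illustrative, not the full
    CVSS v3 equation)."""
    score = min(exploitability + impact, 10.0)
    return round(score, 1)

def soc_risk(components):
    # components: iterable of (exploitability, impact, criticality weight)
    # Overall SoC risk as a criticality-weighted average of component scores.
    total_w = sum(w for _, _, w in components)
    return sum(cvss_base(e, i) * w for e, i, w in components) / total_w
```

In the paper, the component list and weights would come from graph analysis of the SoC architecture, with weights reflecting how critical each attacked component is.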
Measurement and Metrics Testing - We continue to tackle the problem of poorly defined security metrics by building on and improving our previous work on designing sound security metrics. We reformulate the previous method into a set of conditions that are clearer and more widely applicable for deriving sound security metrics. We also modify and enhance some concepts that led to an unforeseen weakness in the previous method that was subsequently found by users, thereby eliminating this weakness from the conditions. We present examples showing how the conditions can be used to obtain sound security metrics. To demonstrate the conditions’ versatility, we apply them to show that an aggregate security metric made up of sound security metrics is also sound. This is useful where the use of an aggregate measure may be preferred, to more easily understand the security of a system.
Authored by George Yee
Measurement and Metrics Testing - Any type of engineered design requires metrics for trading off both desirable and undesirable properties. For integrated circuits, typical properties include circuit size, performance, power, etc., where for example, performance is a desirable property and power consumption is not. Security metrics, on the other hand, are extremely difficult to develop because there are active adversaries that intend to compromise the protected circuitry. This implies metric values may not be static quantities, but instead are measures that degrade depending on attack effectiveness. In order to deal with this dynamic aspect of a security metric, a general attack model is proposed that enables the effectiveness of various security approaches to be directly compared in the context of an attack. Here, we describe, define and demonstrate that the metrics presented are both meaningful and measurable.
Authored by Ruben Purdy, Danielle Duvalsaint, R. Blanton
MANET Security - The detection and maintenance of the pathway from the source to the destination, or from one node to another, is the major role played by the nodes in a MANET. During their lifetime, nodes arrive at or leave the network and continually change their relative location. This dynamic nature introduces several security issues. A secure routing protocol is a significant means of attaining better security in the network by protecting the routing protocols against attacks. Thus, this research work focuses on developing a secure routing protocol for MANETs. Here, a dynamic anomaly detection scheme is proposed to detect malicious attacks in the network. This scheme has been incorporated with the AODV protocol to enhance the performance of AODV in disseminating packets to the target node. In this research work, the Protected AODV (PAODV) protocol is introduced to identify false alarm nodes in the network and route paths for reliable communication between source and destination. Simulation results show that the detection rate is improved while the packet drop rate and delay are minimized compared to the existing technique.
Authored by Jebakumar D, E.P. Prakash, Dhanapal R, Aby Thomas, K. Karthikeyan, P. Poovizhi
MANET Security - Recently, the mobile ad hoc network (MANET) has enjoyed a great reputation thanks to its advantages: high performance, no expensive infrastructure to install, use of unlicensed frequency spectrum, and fast distribution of information around the transmitter. However, the topology of MANETs exposes them to several attacks. Although authentication and encryption techniques can provide some protection, especially by minimizing the number of intrusions, such cryptographic techniques do not work effectively against unseen or unknown attacks. In this case, machine learning approaches are successful at detecting unfamiliar intrusive behavior. Security methodologies in MANETs mainly focus on eliminating malicious attacks and misbehaving nodes, and on providing secure routing.
Authored by Wafa Bouassaba, Abdellah Nabou, Mohammed Ouzzif
MANET Security - The current study is confined to proposing a reputation-based approach for detecting malicious activity, in which the past activity of each node is recorded for future reference. The mobile ad-hoc network, commonly called a MANET, is regarded as a critical wireless network of mobile devices using their own resources. Security is considered the main challenge in MANETs. Much existing work has been done on detecting attacks using various approaches such as intrusion detection, bait detection, cooperative malicious-node detection, and so on. In this paper, some approaches for identifying malicious nodes are discussed. The reputation-based approach mainly focuses on detecting the critical nodes on the trusted path rather than the shortest path. Each node records its own activity, such as data received and transferred. As soon as a node updates its activity, it is verified and a trust factor is assigned. By comparing the assigned trust factors, a list of suspicious or malicious nodes is created.
Authored by Prolay Ghosh, Dhanraj Verma
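The trust-factor mechanism described above can be sketched as follows; the smoothing rate `alpha`, the forwarded/received ratio as the behaviour signal, and the suspicion threshold are illustrative assumptions, not the paper's scheme:

```python
def update_trust(trust, forwarded, received, alpha=0.3):
    """Exponentially weighted trust update from a node's recorded activity
    (packets forwarded vs. received); alpha is an assumed smoothing rate."""
    behaviour = forwarded / received if received else 0.0
    return (1 - alpha) * trust + alpha * min(behaviour, 1.0)

def suspicious(nodes, threshold=0.4):
    # nodes: {node_id: trust factor}. Nodes whose trust has fallen below
    # the threshold are listed as suspicious/malicious candidates.
    return [n for n, t in nodes.items() if t < threshold]
```

A trusted-path route would then prefer next hops with the highest trust factors rather than the shortest hop count.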
MANET Security - Wireless communication technologies play a critical role in the design and implementation of Mobile Ad hoc Networks (MANETs). The characteristics of MANETs, such as dynamic topology, limited bandwidth, and constrained power consumption, increase the complexity of existing protocols built on unlicensed communication technologies. This paper analyzes current and near-future wireless communication technologies in the 2.4 GHz band. Additionally, it compares the features and limits of those technologies and finally concludes with the need for the development of a suitable unified protocol for existing and future wireless technologies. It is expected that the survey and comparison presented in this paper will help researchers and application developers in choosing an appropriate technology for MANET services.
Authored by Seema Barda, Prabhjot Manocha
MANET Security - Mobile ad hoc networks can expand the service zones of access networks and offer wireless coverage to previously unconnected or poorly covered areas. Ad hoc networking faces transmission failures, limited wireless range, hidden-terminal faults and packet losses, mobility-induced route alterations, and battery constraints. A network layer metric reflects total network performance. Ad hoc networking provides access networks, a dynamic multi-hop architecture, and peer-to-peer communication. In a MANET, each node acts as a router, determining the optimum route by traversing other nodes. MANETs feature dynamic topology, fast deployment, energy-restricted operation, and adjustable capacity and bandwidth. The dynamic nature of MANETs increases security vulnerabilities. Researchers have employed intrusion detection, routing, and other techniques to provide security solutions, but current technologies cannot fully safeguard network nodes. In a hostile environment, network performance decreases as the number of nodes increases. This paper presents a reliable and energy-efficient Firefly Energy Optimized Routing (IFEOR)-based routing method to optimize the energy used for MANET data transmission. IFEOR measures firefly light intensity in the MANET to improve routing stability: the energy consumption of the route path determines the firefly's brightness during MANET data packet transfer. Adopting IFEOR improved packet delivery rates and routing overhead. End-to-end delay is not reduced, since nodes in a route may be idle before sending a message, and unused nodes still consume energy.
Authored by Morukurthi Sreenivasu, Badarla Anil
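The brightness/energy relationship described above, where a route's firefly "light intensity" is higher when its energy consumption is lower, can be sketched as follows; the intensity formula and the `gamma` coefficient are illustrative assumptions, not IFEOR's exact equations:

```python
def brightness(route_energy_cost: float, gamma: float = 1.0) -> float:
    """Firefly 'light intensity' of a route: brighter when the route's total
    energy consumption is lower (gamma is an assumed absorption factor)."""
    return 1.0 / (1.0 + gamma * route_energy_cost)

def select_route(routes):
    # routes: {route_id: total energy consumed along the path}
    # Firefly-style choice: pick the brightest (lowest-energy) route.
    return max(routes, key=lambda r: brightness(routes[r]))
```

Stability then follows from re-evaluating brightness as nodes' residual energy changes, steering traffic away from energy-depleted paths.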