- Jintu K Joseph
- August 6, 2025
Understanding Security and Safety in Modern SoC Designs | Part 2
If you haven’t read Part 1 yet, check it out here: Understanding Security and Safety in Modern SoC Designs | Part 1 to get the foundational insights before diving in.
1. Security Mechanisms for Surface-Level Attack Prevention
To protect SoCs from surface-level attacks, several robust security mechanisms are employed. These features are designed to prevent unauthorised access, ensure data integrity, and maintain the trustworthiness of the system throughout its operation.
- Secure Boot with Digital Signature Verification: Secure Boot is a process that ensures only trusted software runs when a device starts up. When the SoC powers on, it begins by executing a small piece of immutable code from read-only memory (the boot ROM), which acts as the root of trust. Before handing control to the next stage, typically the bootloader, this code checks that stage’s digital signature (like a security stamp) using cryptographic methods. If the signature matches what’s expected, the software is considered authentic and allowed to run, and each later stage can verify the one after it in the same way. If it doesn’t match, the boot process is halted. This prevents attackers from inserting malicious code into the early stages of the device startup.
- Memory Protection & Access Controls: In an SoC, different parts of the chip often need memory access, but not all parts should have access to everything. Memory protection mechanisms enforce rules about which parts of the system can access which regions of memory. Access control blocks help restrict certain operations, ensuring that sensitive data is only available to authorised components or software. This helps prevent accidental leaks, unauthorised access, or deliberate tampering.
- Regular Signed Firmware Updates: Firmware is the software embedded in hardware components, and it occasionally needs updates to fix bugs or add features. However, if updates are not secure, they can be hijacked by attackers to install malicious code. That’s why SoCs rely on digitally signed firmware updates. Just like with Secure Boot, the system checks the digital signature of new firmware before installing it. This ensures that only verified and trusted firmware, usually issued by the device manufacturer, can be accepted.
- Runtime Monitoring & Anomaly Detection: Even after the system has booted securely and is running trusted firmware, attacks can still occur at runtime. Runtime monitoring tools constantly check the behaviour of the chip, watching for signs of unusual or unexpected activity. For example, if a part of the chip suddenly starts accessing memory it has never used before, or if there’s a sudden spike in communication over an interface, the system can flag it as a potential threat. Anomaly detection systems may then trigger alarms, restrict certain operations, or even shut down the chip to prevent further damage.
Together, these mechanisms form a strong defence against surface-level attacks. They make sure that only authorised code runs, sensitive data is protected, updates are safe, and unusual activity is quickly spotted and handled.
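The signature check at the heart of Secure Boot and signed firmware updates can be sketched in a few lines. This is a toy illustration only: it uses textbook RSA with no padding and a tiny key built from two known Mersenne primes, and the names `verify_image`, `boot`, and `sign_image` are hypothetical. Real devices rely on vetted schemes (for example RSA-PSS or ECDSA) implemented in ROM code or dedicated hardware.

```python
import hashlib

# Toy public key (hypothetical): modulus built from two known Mersenne primes.
# Far too small and unpadded for real use; it only illustrates the check.
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q   # public modulus, would live in ROM or fuses
E = 65537   # public exponent


def digest_as_int(image: bytes) -> int:
    """Hash the image and reduce it into the range of the toy modulus."""
    return int.from_bytes(hashlib.sha256(image).digest(), "big") % N


def verify_image(image: bytes, signature: int) -> bool:
    """Return True only if the signature matches the image under (N, E)."""
    return pow(signature, E, N) == digest_as_int(image)


def boot(image: bytes, signature: int) -> str:
    """Hand over control only when verification succeeds; otherwise halt."""
    return "boot" if verify_image(image, signature) else "halt"


def sign_image(image: bytes) -> int:
    """Vendor-side signing with the private exponent (never stored on the chip)."""
    d = pow(E, -1, (P - 1) * (Q - 1))  # modular inverse, Python 3.8+
    return pow(digest_as_int(image), d, N)
```

The chip only ever needs the public values `N` and `E`, so an attacker who extracts them still cannot produce a valid signature for a modified image.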
2. Understanding Hardware Trojans in SoCs
Hardware Trojans (HTs) are hidden threats inserted into a chip during its design or manufacturing process. These are not bugs or mistakes—they are intentional modifications made to perform harmful actions. A Trojan can silently leak secret data, shut down a part of the chip, or slow down the system, all without being noticed during normal operation. The danger is that they often remain hidden until triggered under very specific conditions.
Different Types of Hardware Trojans:
- Combinational Trojans: These are triggered only when a very rare combination of inputs occurs—something that rarely happens during testing. Once activated, they might leak data or disrupt operations. Think of it like a secret door that opens only when someone presses the right buttons in a certain order.
- Sequential Trojans: These Trojans wait for a specific sequence of events to happen before activating. For example, they might only turn on after the chip has performed a task five times in a row. Because of this, they are even harder to detect and often go unnoticed during normal testing.
- Functional Trojans: These directly interfere with how the chip works. They may change the output of a calculation, corrupt data, or cause wrong behaviour. Unlike Trojans that only leak information, these actively alter the chip’s behaviour and can affect its performance or correctness when triggered.
- Leakage Trojans: These are designed to secretly send out sensitive information, such as passwords or encryption keys. They don’t change how the chip behaves but use methods like power usage or electromagnetic signals to leak data to an attacker.
- DoS (Denial of Service) Trojans: These Trojans are meant to shut down or crash the chip entirely. They can block critical operations or overload parts of the chip, making it stop working. In safety-critical systems, this can be extremely dangerous.
- Backdoors & Parametric Changes: Backdoors are hidden access points that allow attackers to secretly control parts of the chip. Parametric changes involve small tweaks in hardware settings that degrade performance or make the chip fail earlier than expected. Both are stealthy and hard to detect.
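The difference between combinational and sequential triggers can be made concrete with a small behavioural model. This is a sketch under assumptions, not a description of any real Trojan: the trigger value `0xDEADBEEF`, the 32-bit input width, and the class and function names are all hypothetical.

```python
import random

TRIGGER_PATTERN = 0xDEADBEEF  # hypothetical rare 32-bit input value


def combinational_trigger(inputs: int) -> bool:
    """Fires only while the inputs equal one specific rare value."""
    return inputs == TRIGGER_PATTERN


class SequentialTrigger:
    """Fires after an event has been seen `threshold` times in a row."""

    def __init__(self, threshold: int = 5):
        self.threshold = threshold
        self.count = 0

    def step(self, event_seen: bool) -> bool:
        # Count consecutive events; any gap resets the internal state.
        self.count = self.count + 1 if event_seen else 0
        return self.count >= self.threshold


def random_test_hits(num_vectors: int, seed: int = 0) -> int:
    """Count how many random 32-bit test vectors activate the combinational trigger."""
    rng = random.Random(seed)
    return sum(combinational_trigger(rng.getrandbits(32)) for _ in range(num_vectors))
```

Each random vector hits the combinational trigger with probability 1 in 2^32, so even a million random test vectors have only about a 0.02% chance of activating it once. The sequential variant adds internal state on top of that, which is why directed or formal techniques are usually needed rather than random testing alone.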
Potential Insertion Points for Hardware Trojans
- Design Phase (via IP cores, test logic): Trojan code can be hidden in third-party IP blocks or extra logic added for testing. If not carefully reviewed, this malicious content can sneak into the final design.
- Synthesis or Layout (modified tools): Even if the design is clean, Trojans can be added during the transformation from code to physical layout. This can happen if the tools used for synthesis or layout have been tampered with.
- Fabrication & Packaging (foundry-level manipulation): At the very last stage, during chip manufacturing or packaging, someone with access to the foundry can make subtle changes to introduce a Trojan. These changes are extremely hard to detect once the chip is produced.
Activation and Stealth Characteristics
- Trigger-Based vs. Always-On Trojans: Most Trojans are designed to activate only under rare input conditions, making them stealthy. However, some simpler Trojans are always active, silently leaking information or disrupting functions continuously.
- Stealth Mechanisms: To avoid detection during testing, HTs are often hidden using techniques like:
  - Very low gate count additions
  - Spread-out logic over wide areas
  - Activation only under ultra-rare conditions
In today’s era of cyber-physical systems (CPS) and globally distributed IC design, the security landscape has grown increasingly complex. The diverse and wide-ranging applications of integrated circuits (ICs), combined with outsourcing to third-party manufacturers, have introduced new vulnerabilities. Unfortunately, hardware—just like software—can be subjected to malicious attacks. Manufacturing tools and IP cores from untrusted sources carry enormous risks due to their highly integrated nature.
Because of this distributed manufacturing model, malicious circuits—known as Hardware Trojans—can be implanted at critical stages of IC design and fabrication. These Trojans can alter the intended functionality of a chip, leak sensitive data, or even launch denial-of-service (DoS) attacks.
In summary, Hardware Trojans are like hidden traps planted inside the chip. They can be passive or active, and they can be placed at almost any stage of development. Because they are so sneaky, detecting and preventing them requires careful checks, trusted tools, and secure supply chains.
3. Analog and Mixed-Signal Security Challenges
When we talk about securing a System-on-Chip (SoC), most of the time we focus on the digital parts—like processors, memory, and communication interfaces. But many SoCs also include analog and mixed-signal components, especially in applications like automotive systems, mobile phones, and IoT devices. These parts handle real-world signals—like temperature, pressure, sound, or voltage—and can also be targeted by attackers in unique ways.
- Analog Trojans and Signal Manipulation: Just like digital hardware Trojans, analog Trojans are malicious changes made to the chip, but they work by affecting electrical signals instead of logic operations. For example, an attacker might modify a circuit so that it changes how voltage or current behaves under certain conditions. This could lead to incorrect sensor readings or system instability, and it can be very hard to detect because analog behaviours are often more complex and less predictable.
- Sensor Spoofing and Interface Attacks: Many SoCs connect to external sensors, like temperature sensors, cameras, or accelerometers. If an attacker can send fake signals into these interfaces, they can trick the system into thinking something false is happening. This is called sensor spoofing. For instance, a car’s system might be tricked into thinking the engine is overheating when it’s not, or that a door is open when it’s actually closed. These attacks exploit the trust that systems place in sensor data.
- Power Integrity and Signal Noise-Based Attacks: SoCs rely on clean, stable power to function correctly. Attackers can intentionally create tiny variations in the power supply or introduce electrical noise into the system. These subtle disturbances can cause errors in how the chip processes analog signals. In some cases, this can lead to incorrect behaviour or the leakage of sensitive data. For example, small changes in voltage might make encryption circuits behave slightly differently, giving clues to an attacker through side-channel analysis.
- Countermeasures for Mixed-Signal Components: Protecting analog and mixed-signal parts of the chip requires a different approach than securing digital logic. Some countermeasures include:
1. Using filters and shields to block fake or noisy signals
2. Adding redundant sensors to double-check critical readings
3. Performing analog signal validation in secure hardware
4. Isolating sensitive analog paths from untrusted components
5. Monitoring power supplies for abnormal behaviour
In short, analog and mixed-signal security is a growing area of concern. As more devices rely on sensors and physical-world data, attackers may focus on these weaker spots. Designers need to treat these components with the same care as digital circuits to keep the whole system safe and reliable.
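The redundant-sensor countermeasure listed above can be sketched as a simple median vote. This is a minimal illustration, assuming three or more sensors measuring the same quantity; the function name `validate_reading` and the `max_spread` tolerance are hypothetical.

```python
from statistics import median


def validate_reading(readings: list[float], max_spread: float) -> tuple[float, list[int]]:
    """Return the median reading and the indices of sensors that deviate too far from it."""
    if len(readings) < 3:
        raise ValueError("need at least three redundant sensors to out-vote one")
    agreed = median(readings)
    suspects = [i for i, r in enumerate(readings) if abs(r - agreed) > max_spread]
    return agreed, suspects
```

With readings such as `[90.1, 89.8, 140.0]` and a tolerance of `2.0`, the median ignores the outlier and the third sensor is reported as suspect. A spoofed or faulty sensor is out-voted only as long as the attacker cannot influence a majority of the sensors at once.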
4. Detection and Mitigation Strategies in SoCs
To protect against various types of threats in SoCs—whether they are hardware Trojans, surface-level attacks, or side-channel attacks—engineers use a variety of techniques. Each threat type requires a different set of tools and strategies to detect and defend against malicious activities.
For Hardware Trojans:
- Side-Channel Analysis (SCA): This technique looks for unusual patterns in things like power usage, timing, or electromagnetic signals during chip operation. If a Trojan is present, it might cause slight differences compared to a clean chip. Detecting these differences can help identify hidden modifications.
- Golden Model Comparison: A golden model is a known-good version of the chip design. By comparing the behaviour or layout of a manufactured chip with the golden model, engineers can detect any extra or altered logic that may signal the presence of a Trojan.
- Runtime Watchdogs & Self-checks: Watchdog circuits and self-checking logic run inside the chip during its operation. They monitor the behaviour of different components and can raise alarms or reset the system if something unexpected happens, such as a Trojan trying to activate.
- Supply Chain Hardening & Split Manufacturing: To prevent tampering during production, companies secure their supply chain by using trusted vendors and conducting audits. Split manufacturing involves producing different parts of the chip at separate facilities, so no single party has access to the full design, making it harder to insert a Trojan.
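Side-channel analysis against a golden reference can be sketched as a per-sample statistical comparison. The following assumes that several power traces from known-good chips are available and aligned sample by sample; the function name and the `k` threshold are hypothetical, and real flows use far more traces and more robust statistics.

```python
from statistics import mean, stdev


def deviates_from_golden(
    golden_traces: list[list[float]], suspect: list[float], k: float = 4.0
) -> list[int]:
    """Return sample indices where the suspect trace leaves the golden mean +/- k*sigma band."""
    flagged = []
    for i, samples in enumerate(zip(*golden_traces)):
        mu, sigma = mean(samples), stdev(samples)
        # Small floor keeps the band non-zero when golden samples are identical.
        if abs(suspect[i] - mu) > k * max(sigma, 1e-9):
            flagged.append(i)
    return flagged
```

An empty result means the suspect chip stayed within the envelope of the golden population. Flagged indices point to moments in the operation where extra logic may be drawing power, although process variation can also cause false alarms.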
For Surface-Level Attacks:
- Secure Boot with Digital Signature Verification: Ensures that only verified software runs when the chip powers on. It stops attackers from inserting malicious firmware during the boot process.
- Memory Protection & Access Controls: Prevents unauthorised components or users from accessing sensitive memory areas. This keeps confidential data safe from both accidental exposure and deliberate attacks.
- Regular Signed Firmware Updates: Uses digital signatures to ensure that only updates from trusted sources are accepted. This blocks fake or malicious firmware from being installed.
- Runtime Monitoring & Anomaly Detection: Continuously observes the chip’s activity for anything unusual. If a part of the chip behaves abnormally, the system can take immediate action to prevent further damage.
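One simple form of runtime anomaly detection is to record which memory regions each bus master touches during a trusted profiling phase and then flag anything outside that baseline. The sketch below assumes 4 KiB regions and hypothetical master names such as `dma0`; production monitors are implemented in hardware and use richer behavioural models.

```python
class AccessMonitor:
    """Flags memory accesses to regions a bus master never touched during profiling."""

    def __init__(self, region_size: int = 0x1000):
        self.region_size = region_size
        self.baseline: dict[str, set[int]] = {}

    def learn(self, master: str, address: int) -> None:
        """Record a region as normal for this master (trusted profiling phase)."""
        self.baseline.setdefault(master, set()).add(address // self.region_size)

    def check(self, master: str, address: int) -> bool:
        """Return True if the access is anomalous and should raise an alert."""
        return address // self.region_size not in self.baseline.get(master, set())
```

What happens after `check` returns `True` (raising an alarm, blocking the transaction, or resetting the chip) is a policy decision for the system designer.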
For Side-Channel Attacks:
- Constant-Time Algorithms: These algorithms take the same amount of time to run, no matter what data is being processed. This prevents attackers from learning secrets by measuring time differences.
- Noise Injection: Random signals or operations are added to mask the real behaviour of the chip. This makes it harder for attackers to pick out useful patterns from power usage or timing.
- Power and Clock Randomisation: Varying the chip’s power supply or clock speed at random intervals makes it more difficult to gather accurate data for a side-channel attack.
- Shielding and Physical Isolation: Special shielding materials and layout techniques can block electromagnetic emissions or reduce leakage from sensitive areas of the chip.
Together, these strategies form a multi-layered defence system. By combining detection and mitigation methods across all levels—hardware, firmware, and runtime—SoCs can be made much more resistant to modern security threats.
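The idea behind constant-time algorithms is easiest to see in a secret comparison. The sketch below contrasts an early-exit comparison with one that always processes every byte. Both function names are hypothetical, and pure Python gives no hard timing guarantees, so this only illustrates the structure; in practice Python code should use the standard library's `hmac.compare_digest`.

```python
import hmac


def leaky_equal(a: bytes, b: bytes) -> bool:
    """Early-exit comparison: run time depends on the position of the first mismatch."""
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        if x != y:
            return False  # returns sooner the earlier the mismatch occurs
    return True


def constant_time_equal(a: bytes, b: bytes) -> bool:
    """Accumulate differences over every byte so no early exit reveals where they differ."""
    if len(a) != len(b):
        return False
    diff = 0
    for x, y in zip(a, b):
        diff |= x ^ y
    return diff == 0


def check_tag(expected: bytes, received: bytes) -> bool:
    """Preferred form: delegate to the standard library's timing-safe comparison."""
    return hmac.compare_digest(expected, received)
```

With `leaky_equal`, an attacker who can measure response time can guess a secret one byte at a time, because each correct leading byte makes the comparison run slightly longer.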
5. RISC-V vs. CISC: Security and Safety Considerations
When designing a secure and safe System-on-Chip (SoC), the choice of processor architecture can have a big impact. Two major options are RISC-V and CISC (such as x86). Each has strengths and weaknesses when it comes to security and safety.
RISC-V Architecture:
- RISC-V is an open-standard instruction set architecture (ISA), which means anyone can read the specification and build processors around it without paying licensing fees. Individual RISC-V cores may themselves be open-source or proprietary. This openness allows designers to customise the processor to meet specific safety and security needs. For example, extra checks or redundant safety logic can be added for use in critical systems like automotive or medical devices.
- One advantage of RISC-V’s openness is that it is easily inspectable. With open-source cores, security experts and developers can closely examine the hardware design to look for bugs, backdoors, or vulnerabilities. This transparency promotes trust and makes it easier to fix problems early in the design process.
- However, because RISC-V is relatively new, it may lack mature certifications that are needed for safety-critical applications. In industries like aerospace or automotive, regulatory standards require proven and thoroughly tested hardware. RISC-V’s flexibility also means designers must choose to include and correctly configure optional security features such as PMP (Physical Memory Protection), which controls memory access. If these features are not implemented properly, it could lead to vulnerabilities.
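The kind of check PMP performs can be sketched as a priority-ordered list of regions with read, write, and execute permissions. This is a simplified model, not the actual RISC-V PMP register encoding: the `Region` type and `access_allowed` function are hypothetical, and the deny-by-default rule models an access from a less-privileged mode.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    base: int
    size: int
    perms: str  # any combination of "r", "w", "x"


def access_allowed(regions: list[Region], address: int, access: str) -> bool:
    """First matching region decides; no match means the access is denied."""
    for region in regions:
        if region.base <= address < region.base + region.size:
            return access in region.perms
    return False
```

Because the first match wins, ordering matters: a small read/execute region placed ahead of a larger read/write region protects code inside it from being overwritten. Getting that ordering wrong is one example of the misconfiguration risk mentioned above.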
CISC Architecture (e.g., x86):
- CISC stands for Complex Instruction Set Computing, and x86 is a well-known example used in most desktop and server computers. x86 processors are proprietary and come from companies like Intel and AMD. These processors are packed with advanced security features, such as Intel SGX (Software Guard Extensions) and Secure Boot mechanisms that help protect against a wide range of attacks.
- Because x86 processors are built and maintained by large, experienced companies, they benefit from years of security testing and robust support. They are often better suited for use in complex systems that need high-end performance and protection.
- On the downside, CISC processors are closed-source, so users cannot easily inspect the internals or verify that there are no hidden vulnerabilities. Also, their complexity makes it harder to certify them for safety-critical use, as validating every possible behaviour is a massive task.
Summary:
- RISC-V offers flexibility, transparency, and ease of customisation but may require extra work to meet certification and security requirements.
- x86 provides mature security features and strong support, but comes with limited visibility and added complexity.
Choosing between RISC-V and CISC depends on the project’s goals—whether flexibility and openness are more important, or whether proven security features and support are the priority.
6. Functional Safety and Security in SoCs
Functional safety and functional security are two pillars of reliable SoC design. Although they focus on different types of risks, both are essential to ensure the safe and trustworthy operation of modern electronic systems.
Functional safety is all about protecting the system from unintentional failures. These are issues that can arise from things like electrical faults, ageing hardware, or software bugs. In critical environments—such as automotive, industrial, or medical systems—these failures can lead to serious consequences, including injury or even loss of life. To prevent this, designers add specific hardware and software mechanisms to detect faults and respond appropriately. For example, lockstep cores run duplicate operations and compare results in real-time to catch errors. ECC (Error-Correcting Code) memory automatically fixes data corruption, and watchdog timers reset the system if it freezes or misbehaves. Safety islands within the chip can operate independently, monitoring the rest of the system and initiating emergency actions if something goes wrong.
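Two of these mechanisms, lockstep comparison and the watchdog timer, can be modelled in software to show the logic involved. This is a behavioural sketch only, with hypothetical names; in a real SoC both are hardware blocks, and the lockstep comparison happens cycle by cycle rather than per function call.

```python
from typing import Callable


def run_lockstep(op: Callable[[int], int], shadow_op: Callable[[int], int], value: int) -> int:
    """Run the same operation on two 'cores' and refuse to return a result they disagree on."""
    primary, shadow = op(value), shadow_op(value)
    if primary != shadow:
        raise RuntimeError("lockstep mismatch: fault detected")
    return primary


class Watchdog:
    """Counts down each tick; software must kick it before it reaches zero."""

    def __init__(self, timeout_ticks: int):
        self.timeout_ticks = timeout_ticks
        self.remaining = timeout_ticks

    def kick(self) -> None:
        """Called by healthy software to restart the countdown."""
        self.remaining = self.timeout_ticks

    def tick(self) -> bool:
        """Advance one tick; return True when the timeout expires and a reset is due."""
        self.remaining -= 1
        return self.remaining <= 0
```

Lockstep detects that something went wrong but not which core is at fault, so the usual response is to enter a safe state or reset rather than to pick one of the two results.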
Functional security, on the other hand, deals with intentional threats—people trying to break into the system, steal data, or cause harm. These attacks might target the chip’s firmware, communication channels, or even physical hardware. To defend against them, SoCs include features like secure boot chains, which ensure that only trusted software can run when the device powers on. Trusted Execution Environments (TEEs) create isolated regions for sensitive tasks, keeping them separate from the main operating system. Cryptographic engines help encrypt and decrypt data securely, while tamper detection circuits protect against physical intrusion. Access control firewalls prevent unauthorised parts of the system from accessing sensitive areas.
While functional safety and security have different goals, they are deeply connected. A failure in safety mechanisms can create vulnerabilities that attackers might exploit. Similarly, a successful security breach can trigger unsafe behaviour. That’s why modern SoC design treats these two aspects as equally important and often integrates them into a unified architecture.
In a world where SoCs are becoming the brains of everything from smart cars to smart homes, ensuring both safety and security is no longer optional—it’s the foundation of user trust and system reliability.
Functional Safety Features
To help SoCs detect and respond to failures, several safety features are built directly into the hardware. These features are designed to catch problems before they cause serious damage, especially in systems where safety is critical.
- Lockstep Cores for Redundancy: Lockstep cores are like having two processors doing the same job at the same time. They both run the same instructions in parallel, and their results are constantly compared. If their outputs ever differ, it means something went wrong—like a fault or error—and the system can quickly take action. This is useful in safety-critical systems like aeroplanes or cars, where even a tiny error could have big consequences.
- ECC Memory to Detect/Correct Bit Errors: ECC stands for Error-Correcting Code. ECC memory is smart memory that doesn’t just store data—it also adds extra bits that help check if the data has changed by mistake. If it detects a small error (like a single bit flipping from 0 to 1), it can fix it automatically. This is important because memory errors can happen randomly, due to heat, ageing, or cosmic rays, and fixing them prevents system crashes.
- Watchdog Timers for Automatic Recovery: A watchdog timer is like a safety clock that keeps checking if the system is still running properly. If the system stops responding or gets stuck, and the timer isn’t reset in time, the watchdog assumes something is wrong and forces the system to restart. This automatic recovery helps prevent long downtimes or dangerous situations caused by frozen systems.
- Safety Islands for Independent Monitoring: Safety islands are special, isolated sections within the chip that work independently from the rest of the system. Their job is to monitor everything and stay alert for errors or abnormal behaviour. Since they are separated from the main processing units, they can still function and take control even if the rest of the chip fails. This isolation makes them reliable guardians in safety-critical applications.
These features work together to make sure the SoC can detect, correct, and recover from unexpected problems, keeping systems safe and reliable.
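The way ECC uses extra bits to locate and fix a flipped bit can be shown with the classic Hamming(7,4) code, which protects 4 data bits with 3 parity bits. This is an illustrative sketch with hypothetical function names. Real ECC memory works on wider words and typically adds one more parity bit so that double-bit errors can be detected as well as single-bit errors corrected.

```python
def hamming74_encode(data: list[int]) -> list[int]:
    """Encode 4 data bits into a 7-bit codeword laid out as p1 p2 d1 p3 d2 d3 d4."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(code: list[int]) -> tuple[list[int], int]:
    """Return (data bits, corrected position); position 0 means no error was seen."""
    c = list(code)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s3  # syndrome is the 1-based index of the flipped bit
    if position:
        c[position - 1] ^= 1  # flip the faulty bit back
    return [c[2], c[4], c[5], c[6]], position
```

Each parity bit covers a different overlapping subset of positions, so the pattern of failed parity checks spells out the position of the flipped bit in binary.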
7. Role of Verification Engineers in SoC Security
Verification engineers play a key role in making sure that System-on-Chip (SoC) designs are not only working correctly but are also protected from security threats. Their job is to test and confirm that the chip does exactly what it is supposed to do—and nothing more—before and after it is manufactured.
- Defining and Validating Security Requirements: Before testing begins, verification engineers help define what “secure” means for a particular chip. This could include rules like who can access certain parts of memory or what should happen if an unknown piece of software tries to run. Once these rules are defined, they create tests to make sure the chip follows them.
- Simulating Attack Scenarios and Fuzz Testing: Verification engineers simulate attacks on the chip to see how it reacts. They might test what happens if someone tries to send strange or unexpected data through a USB port or attempts to read protected memory. Fuzz testing involves sending lots of random data to the system to see if it crashes or behaves strangely—this helps uncover hidden bugs or weak points.
- Threat Modelling and Vulnerability Analysis: They also help build models of how someone might try to attack the chip. This process, called threat modelling, helps identify where the most vulnerable parts of the design are. After that, they perform vulnerability analysis to test and strengthen those weak spots.
- Ensuring Compliance with Security Standards: There are industry standards for security, such as FIPS 140-2 (since superseded by FIPS 140-3) or IEC 62443, that chips must meet, especially in government, automotive, or industrial systems. Verification engineers help make sure the chip follows these standards by designing tests and collecting the right evidence for certification.
- Using Emulation and Formal Verification to Detect Hardware Trojans (HTs): To catch hidden threats like hardware Trojans (malicious changes to the chip), verification engineers use advanced tools. Emulation lets them run the design on dedicated hardware, fast enough to exercise realistic workloads, while formal verification uses mathematics to prove that specified properties of the design hold in all cases.
Their work is important throughout the chip’s entire life, from the early design phase (pre-silicon) to after it is built and running in a real device (post-silicon). Without their careful testing and security checks, chips could go to market with serious flaws that put users and systems at risk.
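The fuzz testing described above can be sketched as a loop that throws random inputs at a target and records unexpected failures. Everything here is hypothetical: `parse_packet` stands in for a device under test and contains a deliberately planted bug, and real verification flows drive RTL simulations or emulators with coverage-guided fuzzers rather than a plain random loop.

```python
import random


def parse_packet(packet: bytes) -> bytes:
    """Hypothetical device under test: a length-prefixed packet parser with a planted bug."""
    if not packet:
        raise ValueError("empty packet")  # expected, handled rejection
    length = packet[0]
    payload = packet[1:1 + length]
    # Planted bug: trusts the length field and indexes past a short payload.
    checksum = payload[length - 1] if length else 0
    return payload + bytes([checksum])


def fuzz(target, iterations: int, seed: int = 0) -> list[bytes]:
    """Throw random byte strings at target and collect inputs that crash it unexpectedly."""
    rng = random.Random(seed)
    crashes = []
    for _ in range(iterations):
        packet = bytes(rng.getrandbits(8) for _ in range(rng.randint(0, 8)))
        try:
            target(packet)
        except ValueError:
            pass  # clean rejection is the specified behaviour
        except Exception:
            crashes.append(packet)  # anything else is a finding worth triaging
    return crashes
```

Calling `fuzz(parse_packet, 1000)` returns the inputs that triggered the planted out-of-range access. Fixing the seed makes every finding reproducible, which matters when a crash has to be handed to a designer for debugging.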