Network Packet Loss: Everything a Network Engineer Needs in One Article

1. What is Network Packet Loss?

Network packet loss means that some data packets never arrive at their destination: somewhere between the source address and the destination address they are lost or discarded.
Packet loss is not the same as a network outage, but it directly degrades network quality. Typical symptoms include:

  • Extremely slow web page loading
  • Stuttering and high latency in video conferences
  • Interrupted file transfers or failed downloads
  • High latency in games, with characters “teleporting” (rubber-banding)
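
To make the definition concrete, loss is usually expressed as the percentage of sent packets that never arrived. The sketch below is only an illustration; the function name `loss_rate` is an assumption, not something from a particular tool.

```python
def loss_rate(sent: int, received: int) -> float:
    """Return packet loss as a percentage of the packets sent."""
    if sent <= 0:
        raise ValueError("sent must be positive")
    if not 0 <= received <= sent:
        raise ValueError("received must be between 0 and sent")
    # Lost packets divided by sent packets, expressed as a percentage.
    return 100.0 * (sent - received) / sent


# Example: 50 echo requests sent, 45 replies received.
print(f"{loss_rate(50, 45):.1f}% loss")
```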

2. Where Does Packet Loss Occur?

Data packets travel through numerous devices and paths from the source to the destination. Packet loss can occur if any link in the chain has issues:

Common Causes by Location:

  • Local Devices (e.g., PC, network card): Driver faults, aging network cards, or resource exhaustion.
  • Access Layer (switches): Port congestion, broadcast storms, or loops.
  • Distribution/Core Layers: High device CPU usage, or interface transmit/receive errors.
  • Firewalls: ACLs or policies that block legitimate traffic by mistake, or resource bottlenecks.
  • Exit Links: Poor quality on the ISP side, with severe packet loss.
  • Cloud Services: Packet loss on the cloud side (beyond the user's control).

3. How to Distinguish “True Packet Loss” from “False Packet Loss”?

Many novices assume packet loss as soon as a ping fails or a response is slow. Watch out for “false packet loss”:

  • A firewall or ACL that blocks ICMP makes pings fail, but service traffic may still pass without any loss.
  • Loss reported at an intermediate hop while the destination responds normally is not real loss; routers often rate-limit or deprioritize the ICMP replies they generate themselves.
  • Brief network jitter (e.g., while links reconverge) may cause transient loss that does not point to an ongoing stability problem.

Key points: True packet loss typically meets these criteria:

  • Persistent (not sporadic)
  • Consistent across multiple tools (ping, iperf, packet capture)
  • Stable and reproducible loss location
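
The first two criteria can be sketched as a simple check over repeated measurements. This is a minimal illustration under stated assumptions: the function name, the 1% loss floor, and the 80% persistence threshold are all hypothetical values chosen for the example, and the third criterion (a stable loss location) is not modelled.

```python
from typing import Dict, List


def looks_like_true_loss(samples: Dict[str, List[float]],
                         min_loss_pct: float = 1.0,
                         min_persistence: float = 0.8) -> bool:
    """samples maps a tool name to loss percentages from successive test intervals.

    Returns True only if every tool shows loss in most of its intervals,
    i.e. the loss is persistent and consistent across tools.
    """
    if not samples:
        return False
    for series in samples.values():
        if not series:
            return False
        lossy = sum(1 for pct in series if pct >= min_loss_pct)
        if lossy / len(series) < min_persistence:
            return False  # sporadic or absent for this tool
    return True


# ICMP is blocked (ping shows 100% loss) but iperf sees none: likely false loss.
print(looks_like_true_loss({"ping": [100, 100, 100], "iperf": [0, 0, 0]}))
```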

4. What Are the Common Detection Tools?

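  1. ping
    The most basic packet loss test. It sends ICMP echo requests and reports how many replies come back.
    bash
    ping 192.168.1.1 -t          # Windows: ping continuously until stopped with Ctrl+C
    ping -n 50 8.8.8.8          # Windows: send 50 echo requests
    ping -c 50 8.8.8.8          # Linux/macOS: send 50 echo requests
    
    Note: Some devices block ICMP, so a failed ping does not by itself mean the network is down.
  2. tracert/traceroute
    Shows the path hop by hop and helps locate the hop at which packets start being dropped.
    bash
    tracert www.baidu.com       # Windows  
    traceroute www.baidu.com    # Linux  
    
  3. iperf
    A throughput testing tool. In UDP mode it reports lost datagrams directly, which makes it more accurate than ping for measuring loss. An iperf3 server must be running on the target host.
    bash
    iperf3 -s                                # On the target host (192.168.1.100): start the server
    iperf3 -c 192.168.1.100 -u -b 10M -t 10  # On the client: UDP test at 10 Mbps for 10 s
    
  4. Packet Capture (Wireshark/tcpdump)
    The definitive verification method: it shows whether packets were actually sent and whether they were acknowledged.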
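
When ping is run from a script, the loss figure can be pulled out of its summary line. The sketch below assumes the usual English summary formats of Linux/macOS ping ("N packets transmitted, M received") and Windows ping ("Sent = N, Received = M"); localized output will not match, and the function name `parse_ping_loss` is made up for this example.

```python
import re
from typing import Optional

# Linux: "50 packets transmitted, 45 received, ..."; macOS adds "packets" before "received".
_UNIX = re.compile(r"(\d+) packets transmitted, (\d+) (?:packets )?received")
# Windows (English): "Packets: Sent = 50, Received = 45, Lost = 5 (10% loss),"
_WINDOWS = re.compile(r"Sent = (\d+), Received = (\d+)")


def parse_ping_loss(output: str) -> Optional[float]:
    """Return the loss percentage from ping's summary, or None if it is not found."""
    for pattern in (_UNIX, _WINDOWS):
        match = pattern.search(output)
        if match:
            sent, received = int(match.group(1)), int(match.group(2))
            if sent == 0:
                return None
            return 100.0 * (sent - received) / sent
    return None


sample = "50 packets transmitted, 45 received, 10% packet loss, time 49073ms"
print(parse_ping_loss(sample))
```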

5. What Causes Network Packet Loss?

The causes of packet loss fall into two broad categories: network device problems and link quality problems. The most common ones are:

1. Bandwidth Saturation (Congestion-Induced Loss)

  • Principle: When traffic arrives faster than an interface can send it, the interface buffers fill up and further packets are dropped.
  • Common scenarios:
    • Heavy traffic (e.g., backups, video transfers) exceeding port bandwidth.
    • Cross-VLAN traffic saturating the core switch’s uplink port.
  • How to confirm:
    • Check interface bandwidth utilization (e.g., display interface).
    • Monitor port traffic counters (e.g., display counters interface).
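
If a device only exposes cumulative byte counters, utilization can be derived from two readings taken a known interval apart. The sketch below assumes the counter did not wrap or reset between readings; the function and parameter names are illustrative.

```python
def utilization_pct(bytes_start: int, bytes_end: int,
                    interval_s: float, link_bps: int) -> float:
    """Average link utilization in percent between two byte-counter readings."""
    if interval_s <= 0 or link_bps <= 0:
        raise ValueError("interval_s and link_bps must be positive")
    if bytes_end < bytes_start:
        raise ValueError("counter went backwards (wrap or reset); take new readings")
    bits = (bytes_end - bytes_start) * 8
    return 100.0 * bits / (interval_s * link_bps)


# Example: 6.75 GB transferred in 60 s on a 1 Gbps port.
print(f"{utilization_pct(0, 6_750_000_000, 60, 1_000_000_000):.0f}% utilized")
```

Sustained values close to 100% on an uplink suggest congestion is the cause of the drops.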

2. Interface Errors and Physical Issues

  • Causes: Loose patch cables, poorly seated optical modules, or incorrectly wired twisted-pair cables can cause intermittent loss.
  • Key indicators:
    • CRC (cyclic redundancy check) errors
    • Input/output drops
  • Troubleshooting suggestions:
    • Check error packet statistics via display interface brief, and see whether the counters keep increasing.
    • Reseat the patch cables or test with a known-good cable.
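
Error counters are cumulative, so what matters is how much they grow between two readings rather than their absolute value. A minimal sketch, assuming the counter values have already been collected (for example, copied from the device's interface statistics); the `Counters` fields are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Counters:
    rx_packets: int   # total packets received
    crc_errors: int   # cumulative CRC errors
    input_drops: int  # cumulative input drops


def counter_growth(before: Counters, after: Counters) -> dict:
    """Compare two readings of the same interface and report how the errors grew."""
    packets = after.rx_packets - before.rx_packets
    crc = after.crc_errors - before.crc_errors
    drops = after.input_drops - before.input_drops
    crc_ratio = crc / packets if packets > 0 else 0.0
    return {"crc": crc, "drops": drops, "crc_ratio": crc_ratio}


# Two readings taken a few minutes apart: CRC errors are still climbing.
growth = counter_growth(Counters(1_000_000, 120, 30), Counters(1_100_000, 620, 30))
print(growth)
```

Growing CRC errors point to the physical layer (cable, connector, optical module), while growing drops with no CRC errors point more toward congestion.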

3. High CPU/Memory Usage

  • When a device runs out of processing capacity, it starts dropping packets.
  • Common in:
    • Stacks of multiple devices, where the master switch is under heavy load.
    • Firewalls overloaded by large numbers of concurrent policies, NAT translations, and sessions.
    • Routers whose CPU usage spikes, reducing forwarding capability.
  • Troubleshooting methods:
    • Check device CPU usage (display cpu-usage).
    • Check whether forwarding tables (e.g., ARP table, MAC table) hold an excessive number of entries.

4. Broadcast Storms/Loop Issues

  • Typical symptoms: Widespread network outages; even pings to the management port fail.
  • Investigation directions:
    • Check whether STP is enabled and loop protection is working.
    • Capture packets and look for large numbers of repeated broadcasts (a storm).

5. Traffic Blocked by Policy (ACL, Firewall)

  • This is sometimes perceived as “loss,” but the traffic is actually being rejected by a policy.
  • Checkpoints:
    • Verify that the ACL rules permit the traffic.
    • Check the firewall for policies that discard the traffic.
  • Case example: A client experienced server timeouts because a switch ACL was blocking TCP port 443 (HTTPS).
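
The case above can be reproduced with a toy first-match rule evaluator, which shows why ping may succeed while HTTPS times out. This is a simplified sketch, not any vendor's ACL engine: the `Rule` type is hypothetical, only protocol and destination port are matched, and the default action for unmatched traffic is an assumption because it differs between platforms.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Rule:
    action: str                     # "permit" or "deny"
    protocol: str                   # "tcp", "udp", "icmp" or "any"
    dst_port: Optional[int] = None  # None matches any port


def evaluate(rules: List[Rule], protocol: str, dst_port: Optional[int],
             default: str = "deny") -> str:
    """Return the action of the first rule that matches, else the default."""
    for rule in rules:
        if rule.protocol not in ("any", protocol):
            continue
        if rule.dst_port is not None and rule.dst_port != dst_port:
            continue
        return rule.action
    return default


acl = [Rule("deny", "tcp", 443), Rule("permit", "any")]
print(evaluate(acl, "icmp", None))  # ping is permitted
print(evaluate(acl, "tcp", 443))    # HTTPS is denied by the first rule
```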

6. Scientific Troubleshooting Logic: A Flowchart

[Terminal Loss?]  
↳ Check local NIC drivers, utilization, ARP  
[Access Layer Loss?]  
↳ Check port traffic, CRC, MAC learning  
[Distribution/Core Layer Loss?]  
↳ Check link load, policy configuration, NAT forwarding  
[Exit Loss?]  
↳ ISP line quality, SLA, external speed tests  
[Application Layer Misjudgment?]  
↳ Application bugs, session control, short timeouts  
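
The flow above amounts to checking each layer in order, from the terminal outward, and stopping at the first one that fails. A minimal sketch follows; the check functions are placeholders that would be replaced by real tests (ping, counter checks, and so on), and all names are hypothetical.

```python
from typing import Callable, List, Optional, Tuple

Check = Tuple[str, Callable[[], bool]]


def first_faulty_layer(checks: List[Check]) -> Optional[str]:
    """Run the checks in order and return the name of the first layer that fails."""
    for name, is_healthy in checks:
        if not is_healthy():
            return name
    return None  # every layer passed; suspect the application itself


# Placeholder results standing in for real measurements.
checks: List[Check] = [
    ("Terminal", lambda: True),
    ("Access layer", lambda: True),
    ("Distribution/core layer", lambda: False),
    ("Exit", lambda: True),
]
print(first_faulty_layer(checks))
```

Working outward in a fixed order keeps the search systematic and avoids chasing ISP problems before local faults are ruled out.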

7. High-Frequency Packet Loss Scenarios and Case Summaries

Case 1: Intermittent Loss During Link Instability

  • Symptoms: Only about 50% of pings succeed; web pages load slowly.
  • Causes:
    • Loose network cables
    • Abnormal interface negotiation (e.g., a Gigabit port negotiating down to 100 Mbps)
    • Firewall ICMP flood protection rate-limiting ping replies
  • Solutions:
    • Replace the cables and make sure both ends negotiate the same speed and duplex.
    • Adjust the firewall policy to relax the ICMP rate limit.

Case 2: Post-Power-On Switch Communication Failure

  • Symptoms: The switch cannot ping any host for the first few minutes after startup.
  • Causes:
    • The configuration is still loading during startup.
    • STP (Spanning Tree Protocol) has not converged yet, so ports are still in the blocking state.
  • Solutions:
    • Use spanning-tree portfast (Cisco) or stp edged-port enable (Huawei) on ports that connect to end hosts so they start forwarding immediately.
    • Test only after STP has fully converged.
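
As a rough worked example of why edge-port settings help: with classic STP timers, a newly active port passes through the listening and learning states, each lasting one forward-delay interval, before it forwards traffic. The sketch assumes the common default forward delay of 15 seconds; the device's actual timers may differ, and boot and configuration loading time come on top of this.

```python
def stp_port_delay_s(forward_delay_s: int = 15, edge_port: bool = False) -> int:
    """Seconds a port waits before forwarding after link-up under classic STP timers."""
    if edge_port:
        return 0  # edge (portfast) ports skip listening and learning
    return 2 * forward_delay_s  # listening + learning


print(stp_port_delay_s())                # default timers
print(stp_port_delay_s(edge_port=True))  # edge port
```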

Case 3: Only Four Ports of a Five-Port Switch Work at Once

  • Symptoms: When a fifth port is connected, one of the ports stops working.
  • Causes:
    • Inadequate power supply
    • Aging switching chip or hardware failure
  • Solutions:
    • Replace the switch.
    • Test the chip's power supply and current fluctuations with professional tools.

Case 4: Switch “COL” Light On or Flashing, No Communication

  • Symptoms: Port communication is abnormal; packet captures show severe loss.
  • Causes:
    • Collisions, as indicated by the collision (COL) light
    • Duplex mismatch: the port is connected to a device that is not running full duplex, or negotiation failed
  • Solutions:
    • Manually set the same duplex mode on both ends.
    • Replace cables or outdated devices to avoid incompatibility.
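
A mismatch is easy to spot once the operational speed and duplex of both ends are written side by side. The sketch below simply compares the two; the `PortSetting` type is hypothetical, and the values would come from each device's interface status output.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PortSetting:
    speed_mbps: int
    duplex: str  # "full" or "half"


def link_mismatches(local: PortSetting, peer: PortSetting) -> List[str]:
    """Return which settings differ between the two ends of a link."""
    problems = []
    if local.speed_mbps != peer.speed_mbps:
        problems.append("speed")
    if local.duplex != peer.duplex:
        problems.append("duplex")
    return problems


# Switch port forced to full duplex, old device running half duplex.
print(link_mismatches(PortSetting(100, "full"), PortSetting(100, "half")))
```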

Case 5: Frequent Service Disconnections After Upgrading to Gigabit

  • Symptoms: Server connections over Gigabit links drop intermittently; packet captures show frequent retransmissions.
  • Causes:
    • Cable or module quality not good enough for Gigabit
    • Port speed not fixed, so negotiation is unstable
  • Solutions:
    • Use Cat6 or better cables.
    • Manually lock both ends to Gigabit full duplex.
    • Update NIC drivers and switch firmware.

Case 6: Severe Cross-VLAN Communication Loss

  • Symptoms: Pings within a VLAN are normal, but pings across VLANs lose packets.
  • Causes:
    • Incorrect Layer 3 VLAN interface configuration
    • ACLs restricting traffic
    • Stale ARP table entries
  • Solutions:
    • Verify VLAN interface IPs, subnet masks, and routes.
    • Clear the ARP cache so entries are relearned.
    • Capture packets to check whether ICMP is being filtered.
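
The address checks in the first solution can be automated with Python's standard `ipaddress` module. The sketch below uses made-up addresses; it verifies whether two hosts share a subnet (and therefore need no routing) and whether a host's configured gateway actually lies inside the host's own subnet.

```python
import ipaddress


def same_subnet(host_a: str, host_b: str) -> bool:
    """True if two 'address/prefix' strings belong to the same network."""
    a = ipaddress.ip_interface(host_a)
    b = ipaddress.ip_interface(host_b)
    return a.network == b.network


def gateway_reachable(host: str, gateway: str) -> bool:
    """True if the gateway address is inside the host's own subnet."""
    iface = ipaddress.ip_interface(host)
    return ipaddress.ip_address(gateway) in iface.network


# Hosts in VLAN 10 and VLAN 20 (example addressing).
print(same_subnet("192.168.10.25/24", "192.168.20.30/24"))    # different subnets
print(gateway_reachable("192.168.20.30/24", "192.168.2.1"))   # mistyped gateway
```

A gateway outside the host's subnet, or a wrong mask on the VLAN interface, produces exactly the "intra-VLAN works, cross-VLAN fails" pattern described in this case.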

8. How to Nip Packet Loss in the Bud?

  1. Reliable device selection:
    • Avoid low-end switches in high-concurrency environments.
    • Use devices that support QoS and hardware forwarding at critical nodes.
  2. Regular inspection mechanisms:
    • Periodically check CPU, memory, interface traffic, and error packets.
    • Use SNMP with a network management platform for 24/7 alerting.
  3. Site environment considerations:
    • Keep the server room temperature at 20–25°C.
    • Ensure clean power, reliable grounding, and protection against static electricity.
  4. Standardized configurations and documentation:
    • Record every change together with its rollback plan.
    • Use configuration templates to avoid human error.
  5. Troubleshooting triad + packet capture:
    • Use ping + traceroute + iperf in combination.
    • When capturing, prioritize ARP, ICMP, and TCP handshakes.
    • Verify that DNS, VLAN, ACL, and route configurations are correct.
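
The regular inspection in item 2 can be reduced to comparing collected metrics against allowed ranges. In the sketch below, the temperature range comes from item 3, while the 80% CPU and memory limits and all metric names are assumptions chosen for illustration; how the metrics are collected (SNMP, CLI scraping, and so on) is left out.

```python
from typing import Dict, List, Tuple

# (low, high) allowed range per metric; CPU and memory limits are assumed values.
THRESHOLDS: Dict[str, Tuple[float, float]] = {
    "cpu_pct": (0.0, 80.0),
    "memory_pct": (0.0, 80.0),
    "room_temp_c": (20.0, 25.0),
}


def out_of_range(metrics: Dict[str, float]) -> List[str]:
    """Return the names of known metrics whose values fall outside their range."""
    alerts = []
    for name, value in metrics.items():
        if name not in THRESHOLDS:
            continue  # ignore metrics without a configured range
        low, high = THRESHOLDS[name]
        if not low <= value <= high:
            alerts.append(name)
    return alerts


print(out_of_range({"cpu_pct": 93.0, "memory_pct": 40.0, "room_temp_c": 27.5}))
```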

 

Don’t fear packet loss; fear not knowing how to troubleshoot it!
Network packet loss is not inherently complex, but it tests your understanding of the overall network architecture, your familiarity with how devices work, and your proficiency with the tools. The more systematic your approach, the more effectively you can resolve it.