US20200162396A1 - Application-level network queueing - Google Patents

Application-level network queueing

Info

Publication number
US20200162396A1
Authority
US
United States
Prior art keywords
nic
queue
application
network packet
incoming
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US16/773,801
Inventor
Anil Vasudevan
Kiran A. Patil
Arun Chekhov Ilango
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Intel Corp
Original Assignee
Intel Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Intel Corp
Priority to US16/773,801 (US20200162396A1)
Publication of US20200162396A1
Priority to US17/566,381 (US11706151B2)
Priority to US18/201,068 (US20230300078A1)
Legal status: Abandoned

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks
    • H04L47/50 Queue scheduling
    • H04L47/62 Queue scheduling characterised by scheduling criteria
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks
    • H04L47/50 Queue scheduling
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks
    • H04L47/10 Flow control; Congestion control
    • H04L47/12 Avoiding congestion; Recovering from congestion
    • H04L47/125 Avoiding congestion; Recovering from congestion by balancing the load, e.g. traffic engineering
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks
    • H04L47/10 Flow control; Congestion control
    • H04L47/24 Traffic characterised by specific attributes, e.g. priority or QoS
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks
    • H04L47/10 Flow control; Congestion control
    • H04L47/24 Traffic characterised by specific attributes, e.g. priority or QoS
    • H04L47/2475 Traffic characterised by specific attributes, e.g. priority or QoS for supporting traffic characterised by the type of applications
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L49/00 Packet switching elements
    • H04L49/90 Buffering arrangements
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L49/00 Packet switching elements
    • H04L49/90 Buffering arrangements
    • H04L49/9063 Intermediate storage in different physical parts of a node or terminal
    • H04L49/9068 Intermediate storage in different physical parts of a node or terminal in the network interface card
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/10 Active monitoring, e.g. heartbeat, ping or trace-route

Definitions

  • This disclosure relates in general to the field of computer networking, and more particularly, though not exclusively to, a system and method for application-level network queueing.
  • Busy polling, also known as “busy waiting” or “spinning,” is a technique in which a process repeatedly monitors a condition to see if it has become true. For example, if the process needs input, it may continually loop, watching an input buffer to see if data have become available. Other uses may be to see if a key has been pressed, or if a lock is available. Another use is to create a known time delay, such as by “doing nothing” for a specified number of milliseconds, though variations in processor speed may make this not completely reliable.
  • Busy polling may be a useful technique particularly in low-level programming, where interrupt-driven processing may be inefficient or cumbersome.
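  • The input-buffer case described above can be sketched in a few lines of Python. This is only an illustration: the names `busy_poll` and `demo`, the timeout, and the timer-driven producer are assumptions made for the sketch and do not come from the specification.

```python
import threading
import time
from collections import deque


def busy_poll(buffer, timeout_s=1.0):
    """Spin until `buffer` holds an item, then return it (None on timeout)."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if buffer:  # the condition being monitored
            return buffer.popleft()
        # No sleep and no blocking call: the loop simply checks again.
    return None


def demo():
    buffer = deque()
    # Hypothetical producer: delivers one item after a short delay.
    producer = threading.Timer(0.05, buffer.append, args=("packet-0",))
    producer.start()
    item = busy_poll(buffer)
    producer.join()
    return item
```

The loop consumes CPU time for as long as it waits, which is the cost that has to be weighed against the latency of interrupt-driven wakeups.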
  • FIG. 1 is a block diagram of a security-enabled network according to one or more examples of the present specification.
  • FIG. 2 is a block diagram of a client device according to one or more examples of the present specification.
  • FIG. 3 is a block diagram of a server device according to one or more examples of the present specification.
  • FIG. 4 is a block diagram of a network queueing system according to one or more examples of the present specification.
  • FIG. 5 is a block diagram of a network queueing system according to one or more examples of the present specification.
  • FIG. 6 is a block diagram of an Open Systems Interconnection model (OSI) seven-layer stack according to one or more examples of the present specification.
  • OSI Open Systems Interconnection model
  • FIG. 7 is a flow chart of a method performed by a network device according to one or more examples of the present specification.
  • FIG. 8 is a flow chart of a method performed by an application according to one or more examples of the present specification.
  • FIG. 9 is a flow chart of a method performed by an application according to one or more examples of the present specification.
  • FIG. 10 is a flow chart of a transmit operation according to one or more examples of the present specification.
  • a network interface card comprising: an ingress interface to receive incoming traffic; a plurality of queues to queue incoming traffic; an egress interface to direct incoming traffic to a plurality of server applications; and a queuing engine, including logic to: uniquely associate a queue with a selected server application; receive an incoming network packet; determine that the selected server application may process the incoming network packet; and assign the incoming network packet to the queue.
  • modern day data center tiers may include web servers and data caching servers, with “Big Data” backends like the “Hadoop” software framework. These may be networked using leaf spine architectures. In one example, these tiers are distributed systems in and of themselves and typically service many thousands of concurrent connections, with the number of connections expected to increase in the future as the number of cores on the end nodes increase.
  • Some data centers may have a target performance determinism, such as to meet contractual requirements or as a competitive differentiator.
  • Providing the desired performance determinism (e.g., 99.99 percentile latency, with consistent throughput and CPU utilization)
  • the networking interface is one of the primary contributors to this issue, as it can become a bottleneck to application performance on a busy, high-end server.
  • ALNQ application-level network queuing
  • OSI Open Systems Interconnection
  • TCP/IP transmission control protocol/internet protocol
  • the host network stack is bypassed in whole or in part, and parallel offloaded or onloaded stacks may be run.
  • NICs divide incoming traffic into a plurality of separate “queues,” with traffic being load balanced to queues based on, for example, a hashing algorithm based on a socket tuple that includes the source media access control address (MAC), destination MAC, and/or protocol.
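  • As a rough illustration of this conventional scheme, the sketch below hashes the tuple fields named above (source MAC, destination MAC, protocol) and maps the result onto a fixed number of queues. The function name `select_queue` and the use of CRC32 are assumptions for the sketch; real NICs commonly use a different hash, such as a Toeplitz hash.

```python
import zlib


def select_queue(src_mac, dst_mac, protocol, num_queues):
    """Map a packet's tuple onto one of `num_queues` receive queues."""
    key = f"{src_mac}|{dst_mac}|{protocol}".encode()
    # CRC32 is only a stand-in hash chosen for this illustration.
    return zlib.crc32(key) % num_queues


queue = select_queue("aa:bb:cc:00:00:01", "aa:bb:cc:00:00:02", "tcp", 8)
```

Note that nothing in the key identifies the receiving application, so packets for unrelated applications can land in the same queue.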
  • ALNQ addresses issues such as that applications may desire to have their own “visible” queues and load balancing control using the operating system's networking stack.
  • ALNQ makes this possible by creating scalable application-specific queues on NICs with socket layer visibility and an affinitization and wakeup algorithm applied to kernel event multiplexing interfaces like “EPOLL” and “KQUEUE.” In selected embodiments, this leads to lower and deterministic average and tail latencies, and higher throughput within latency tolerance boundaries, which may help in scaling distributed systems.
  • ALNQ takes a top-down view with respect to steering traffic to an application and delivering low latency communication.
  • Applications, from a networking perspective, may distinguish themselves within a system by virtue of a port number.
  • an HTTP daemon may listen on port 80
  • an e-mail daemon may listen on port 25.
  • ALNQ may use the port number (along with other relevant identifying information) as the primary key for assigning traffic to a queue, i.e. application specific queues. Traffic into these queues may be multiplexed via existing schemes like “Receive Side Scaling” (RSS) for hash-based load balancing, or Intel® Ethernet Flow Director for determinate matching.
  • RSS Receive Side Scaling
  • Information about the assigned queue may be passed up to the application as an opaque identifier.
  • the application may then use this identifier for alignment hints.
  • an application that uses EPOLL may align the application thread handling the file descriptors to the queue, thus streamlining access and mitigating effects from cache line bouncing of socket and protocol related data structures.
  • Interrupts may also be tuned to better match application behavior, as these queues can be distributed between applications, rather than between cores, which may cause them to be shared between applications.
  • low latency as an attribute can be applied to an entire application, and therefore a queue set, rather than for a socket (or flow) that is open. This separation also allows for direct application signaling schemes for wakeup before protocol processing is performed.
  • ALNQ steers NIC traffic by assigning queues not per interface or virtual socket interface (VSI), or by VMs or traffic classes, but rather per port number, with the port number functioning as an application identifier.
  • VSI virtual socket interface
  • a port number may operate as a useful primitive for identifying an application on a particular machine.
  • the application identifier could be any suitable application identifier other than the port number.
  • ALNQ can load balance the queues assigned to an application/port number using known hash-based schemes like RSS or exact match schemes like Flow Director.
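  • A minimal sketch of this two-step steering, assuming a software model rather than NIC hardware: the destination port selects an application-specific queue set, and a hash of the flow then picks one queue within that set. The class `QueueingEngine`, its method names, the packet dictionary fields, and the CRC32 hash are all hypothetical and chosen only for illustration.

```python
import zlib
from collections import deque


class QueueingEngine:
    """Toy model of port-keyed queue assignment (not a real NIC API)."""

    def __init__(self):
        self.queues = []          # all queues, indexed by queue id
        self.port_to_queues = {}  # application identifier (port) -> queue ids

    def register_application(self, port, num_queues):
        """Reserve `num_queues` queues for the application on `port`."""
        ids = []
        for _ in range(num_queues):
            self.queues.append(deque())
            ids.append(len(self.queues) - 1)
        self.port_to_queues[port] = ids
        return ids

    def assign(self, packet):
        """Queue the packet for its application; return the queue id."""
        ids = self.port_to_queues.get(packet["dst_port"])
        if ids is None:
            return None  # no application registered for this port
        # Hash-based balancing inside the application's own queue set.
        flow = f'{packet["src_ip"]}:{packet["src_port"]}'.encode()
        qid = ids[zlib.crc32(flow) % len(ids)]
        self.queues[qid].append(packet)
        return qid


engine = QueueingEngine()
http_queues = engine.register_application(80, 4)
smtp_queues = engine.register_application(25, 2)
```

Because the port is the primary key, a packet for port 80 can only ever land in one of the HTTP queues, regardless of how the flow hash falls.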
  • ALNQ may pass queue information as a token to the application, such as via an enhancement to the sockets interface options.
  • applications may align threads to queues using these tokens.
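  • One way an application might use such tokens is to group its connections by token, so that each worker thread only handles connections fed by a single queue. The sketch below assumes the tokens have already been delivered to the application as opaque values; the function `align_threads` and the connection names are hypothetical.

```python
from collections import defaultdict


def align_threads(connection_tokens):
    """Group connections by queue token: one worker's share per queue."""
    workers = defaultdict(list)
    for conn, token in connection_tokens.items():
        workers[token].append(conn)
    return dict(workers)


# Three connections, two of which arrived on the queue identified by token 7.
plan = align_threads({"conn-a": 7, "conn-b": 9, "conn-c": 7})
```

The application never interprets the token; it only compares tokens for equality, which is consistent with treating the identifier as opaque.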
  • separating application-specific traffic early provides a range of affinitization choices, such as using existing low-latency methods such as Busy Poll Sockets (BPS), providing scalable aligning of queues to application threads, or applying quality of service or security policies.
  • BPS Busy Poll Sockets
  • ALNQ enables an interrupt signaling scheme wherein interrupts signal a sleeping application thread directly, which may then pull data out from device queues.
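  • The wake-then-pull pattern can be modeled in user space with a thread that sleeps on an event and drains a queue once signaled. This is an analogy only: `threading.Event` stands in for the interrupt, a `deque` stands in for the device queue, and the function names are assumptions for the sketch.

```python
import threading
from collections import deque


def run_wakeup_demo(packets):
    device_queue = deque()         # stand-in for a device queue
    interrupt = threading.Event()  # stand-in for the interrupt signal
    received = []

    def app_thread():
        # Sleep until signaled, then pull data out of the queue directly.
        interrupt.wait(timeout=1.0)
        while device_queue:
            received.append(device_queue.popleft())

    worker = threading.Thread(target=app_thread)
    worker.start()
    device_queue.extend(packets)  # traffic arrives while the thread sleeps
    interrupt.set()               # signal the sleeping application thread
    worker.join()
    return received
```

The queue is filled before the event is set, so the woken thread always finds the data waiting for it.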
  • A system and method for application level network queueing will now be described with more particular reference to the attached FIGURES. It should be noted that throughout the FIGURES, certain reference numerals may be repeated to indicate that a particular device or block is wholly or substantially consistent across the FIGURES. This is not, however, intended to imply any particular relationship between the various embodiments disclosed.
  • a genus of elements may be referred to by a particular reference numeral (“widget 10 ”), while individual species or examples of the genus may be referred to by a hyphenated numeral (“first specific widget 10 - 1 ” and “second specific widget 10 - 2 ”).
  • FIG. 1 is a network-level diagram of a secured enterprise 100 according to one or more examples of the present specification.
  • secured enterprise 100 may be configured to provide services or data to one or more customers 162 , who may access information or services via external network 172 . This may require secured enterprise 100 to at least partly expose certain services and networks to the outside world, thus creating a logical security aperture.
  • Each device may include an appropriate operating system, such as Microsoft Windows, Linux, Android, Mac OSX, Apple iOS, Unix, or similar. Some of the foregoing may be more often used on one type of device than another. For example, desktop computers or engineering workstations may be more likely to use one of Microsoft Windows, Linux, Unix, or Mac OSX. Laptop computers, which are usually portable off-the-shelf devices with fewer customization options, may be more likely to run Microsoft Windows or Mac OSX. Mobile devices may be more likely to run Android or iOS. However, these examples are not intended to be limiting.
  • Client devices 110 may be communicatively coupled to one another and to other network resources via enterprise network 170 .
  • Enterprise network 170 may be any suitable network or combination of one or more networks operating on one or more suitable networking protocols, including for example, a local area network, an intranet, a virtual network, a wide area network, a wireless network, a cellular network, or the Internet (optionally accessed via a proxy, virtual machine, or other similar security mechanism) by way of nonlimiting example.
  • Enterprise network 170 may also include one or more servers, firewalls, routers, switches, security appliances, antivirus servers, or other useful network devices, which in an example may be virtualized within workload cluster 142 .
  • enterprise network 170 is shown as a single network for simplicity, but in some embodiments, enterprise network 170 may include a large number of networks, such as one or more enterprise intranets connected to the internet. Enterprise network 170 may also provide access to an external network, such as the Internet, via external network 172 . External network 172 may similarly be any suitable type of network.
  • a workload cluster 142 may be provided, for example as a virtual cluster running in a hypervisor on a plurality of rack-mounted blade servers, or as a cluster of physical servers.
  • Workload cluster 142 may provide one or more server functions, or one or more “microclouds” in one or more hypervisors.
  • a virtualization environment such as vCenter may provide the ability to define a plurality of “tenants,” with each tenant being functionally separate from each other tenant, and each tenant operating as a single-purpose microcloud.
  • Each microcloud may serve a distinctive function, and may include a plurality of virtual machines (VMs) of many different flavors, including agentful and agentless VMs.
  • VMs virtual machines
  • endpoint devices 110 may also be provided via workload cluster 142 .
  • one microcloud may provide a remote desktop hypervisor such as a Citrix workspace, which allows users 120 operating endpoints 110 to remotely login to a remote enterprise desktop and access enterprise applications, workspaces, and data.
  • endpoint 110 could be a “thin client” such as a Google Chromebook, running only a stripped-down operating system, and still provide user 120 useful access to enterprise resources.
  • Management console 140 may also operate on enterprise network 170 .
  • Management console 140 may provide a user interface for a security administrator 150 to define enterprise security policies, which management console 140 may enforce on enterprise network 170 and across client devices 110 and workload cluster 142 .
  • management console 140 may run a server-class operating system, such as Linux, Unix, or Windows Server.
  • management console 140 may be provided as a web interface, on a desktop-class machine, or via a VM provisioned within workload cluster 142 .
  • Secured enterprise 100 may communicate across enterprise boundary 104 with external network 172 .
  • Enterprise boundary 104 may represent a physical, logical, or other boundary.
  • External network 172 may include, for example, websites, servers, network protocols, and other network-based services.
  • secured enterprise 100 may simply be a family, with parents assuming the role of security administrator 150 .
  • the parents may wish to protect their children from undesirable content, such as pornography, adware, spyware, age-inappropriate content, advocacy for certain political, religious, or social movements, or forums for discussing illegal or dangerous activities, by way of non-limiting example. In this case, the parent may perform some or all of the duties of security administrator 150 .
  • Workload cluster 142 may be configured to provide any suitable network services to client devices, such as client device 110 and end user device 180 .
  • workload cluster 142 may be a rack-mount cloud system providing security and infrastructure services such as a firewall, antivirus, deep packet inspection, load balancing, and switching by way of nonlimiting example.
  • Workload cluster 142 may also provide substantive network services, such as a corporate file server, database file system, intranet server, or a database server by way of nonlimiting example.
  • An enterprise user 120 may use a client application 112 on client device 110 to access services on workload cluster 142 via enterprise network 170 .
  • End user 160 may also access network resources. End user 160 may operate an end user application 162 on end user device 180, and access workload cluster 142 via external network 172 and enterprise network 170. In accordance with good corporate security practice, network traffic from outside of secured enterprise 100 may be segregated from corporate network traffic, such as by placing enterprise traffic and external traffic on different subnetworks.
  • Workload cluster 142 may provide to external end users 160 various suitable network services, such as a website over hypertext transfer protocol (HTTP), e-mail over simple mail transfer protocol (SMTP), file sharing via file transfer protocol (FTP), secure FTP (SFTP), or virtual shells via protocols such as telnet or secure shell (SSH).
  • HTTP hypertext transfer protocol
  • SMTP simple mail transfer protocol
  • FTP file transfer protocol
  • SFTP secure FTP
  • SSH secure shell
  • Although end user 160 and corporate user 120 are disclosed here as two separate entities by way of nonlimiting example, strict segregation is not required.
  • An end user 160 may also be a corporate user, and may access corporate resources via a remote client such as Citrix or a remote desktop protocol (RDP).
  • corporate user 120 may also be a guest user with less than full access to enterprise resources.
  • workload cluster 142 is a single workload server 142 having one or more network interface cards (NICs).
  • NICs network interface cards
  • the methods disclosed herein can be scaled to any appropriate number of servers configured in any appropriate manner, such as standalone servers, service appliances, individual servers operating multiple protocols, or a plurality of virtual servers operating on one or a plurality of physical devices.
  • workload server 142 hosts both an HTTP web server receiving traffic on port 80 , and an SMTP mail server receiving traffic on port 25 .
  • Workload server 142 may open a unique communication “socket” or other logical connection for each connection with a client device.
  • 100,000 HTTP sockets will be open on port 80 .
  • 50,000 e-mail users are accessing e-mail services
  • 50,000 SMTP sockets will be open on port 25 .
  • queueing can be a substantial challenge.
  • Certain embodiments of traditional “bottom-up” hash-based queueing such as RSS, in which a packet is hashed starting with the lower levels of the OSI network model, may result in substantial latency due to traffic from other applications sharing the same queues.
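  • The queue-sharing effect can be made concrete with a scaled-down model of the HTTP and SMTP example (1,000 and 500 flows instead of 100,000 and 50,000). The sketch assumes eight queues and uses `flow_id % n` as a deliberately trivial stand-in for a hash; the split of five HTTP queues and three SMTP queues is likewise an arbitrary choice for illustration.

```python
def shared_queues(assignments):
    """Count queues that hold flows from more than one application."""
    apps_per_queue = {}
    for app, queue in assignments:
        apps_per_queue.setdefault(queue, set()).add(app)
    return sum(1 for apps in apps_per_queue.values() if len(apps) > 1)


NUM_QUEUES = 8
flows = [("http", i) for i in range(1000)] + [("smtp", i) for i in range(500)]

# Bottom-up: the queue is chosen from the flow alone, ignoring the application.
flat = [(app, flow_id % NUM_QUEUES) for app, flow_id in flows]

# Application-level: each application owns a disjoint queue set, and the
# same stand-in hash only balances flows inside that set.
queue_sets = {"http": [0, 1, 2, 3, 4], "smtp": [5, 6, 7]}
per_app = [
    (app, queue_sets[app][flow_id % len(queue_sets[app])])
    for app, flow_id in flows
]

flat_shared = shared_queues(flat)        # every queue mixes HTTP and SMTP
per_app_shared = shared_queues(per_app)  # no queue is shared
```

In the flat case all eight queues carry traffic for both applications, whereas with per-application queue sets no queue does.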
  • this specification provides a system and method for providing application-level network queuing to reduce latency and improve performance.
  • Example logical embodiments are illustrated in FIGS. 4 and 5
  • example methods are illustrated in FIGS. 7-9 .
  • FIG. 2 is a block diagram of client device 200 according to one or more examples of the present specification.
  • Client device 200 may be any suitable computing device.
  • a “computing device” may be or comprise, by way of non-limiting example, a computer, workstation, server, mainframe, virtual machine (whether emulated or on a “bare-metal” hypervisor), embedded computer, embedded controller, embedded sensor, personal digital assistant, laptop computer, cellular telephone, IP telephone, smart phone, tablet computer, convertible tablet computer, computing appliance, network appliance, receiver, wearable computer, handheld calculator, or any other electronic, microelectronic, or microelectromechanical device or apparatus for processing and communicating data.
  • Any computing device may be designated as a host on the network. Each computing device may refer to itself as a “local host,” while any computing device external to it may be designated as a “remote host.”
  • client devices 110 may all be examples of client devices 200 .
  • Client device 200 includes a processor 210 connected to a memory 220 , having stored therein executable instructions for providing an operating system 222 and at least software portions of a client agent 224 .
  • Other components of client device 200 include a storage 250 , network interface 260 , and peripheral interface 240 .
  • This architecture is provided by way of example only, and is intended to be non-exclusive and non-limiting. Furthermore, the various parts disclosed are intended to be logical divisions only, and need not necessarily represent physically separate hardware and/or software components.
  • Certain computing devices provide main memory 220 and storage 250 , for example, in a single physical memory device, and in other cases, memory 220 and/or storage 250 are functionally distributed across many physical devices.
  • each logical block disclosed herein is broadly intended to include one or more logic elements configured and operable for providing the disclosed logical operation of that block.
  • logic elements may include hardware, external hardware (digital, analog, or mixed-signal), software, reciprocating software, services, drivers, interfaces, components, modules, algorithms, sensors, firmware, microcode, programmable logic, or objects that can coordinate to achieve a logical operation.
  • processor 210 is communicatively coupled to memory 220 via memory bus 270-3, which may be, for example, a direct memory access (DMA) bus, though other memory architectures are possible, including ones in which memory 220 communicates with processor 210 via system bus 270-1 or some other bus.
  • processor 210 may be communicatively coupled to other devices via a system bus 270 - 1 .
  • a “bus” includes any wired or wireless interconnection line, network, connection, bundle, single bus, multiple buses, crossbar network, single-stage network, multistage network or other conduction medium operable to carry data, signals, or power between parts of a computing device, or between computing devices. It should be noted that these uses are disclosed by way of non-limiting example only, and that some embodiments may omit one or more of the foregoing buses, while others may employ additional or different buses.
  • a “processor” may include any combination of logic elements operable to execute instructions, whether loaded from memory, or implemented directly in hardware, including by way of non-limiting example a microprocessor, digital signal processor, field-programmable gate array, graphics processing unit, programmable logic array, application-specific integrated circuit, or virtual machine processor.
  • a multi-core processor may be provided, in which case processor 210 may be treated as only one core of a multi-core processor, or may be treated as the entire multi-core processor, as appropriate.
  • one or more co-processors may also be provided for specialized or support functions.
  • Processor 210 may be connected to memory 220 in a DMA configuration via DMA bus 270 - 3 (or other similar memory interconnect).
  • memory 220 is disclosed as a single logical block, but in a physical embodiment may include one or more blocks of any suitable volatile or non-volatile memory technology or technologies, including for example DDR RAM, SRAM, DRAM, cache, L1 or L2 memory, on-chip memory, registers, flash, ROM, optical media, virtual memory regions, magnetic or tape memory, or similar.
  • memory 220 may comprise a relatively low-latency volatile main memory, while storage 250 may comprise a relatively higher-latency non-volatile memory.
  • memory 220 and storage 250 need not be physically separate devices, and in some examples may represent simply a logical separation of function. It should also be noted that although DMA is disclosed by way of non-limiting example, DMA is not the only protocol consistent with this specification, and that other memory architectures are available.
  • Storage 250 may be any species of memory 220 , or may be a separate device.
  • Storage 250 may include one or more non-transitory computer-readable mediums, including by way of non-limiting example, a hard drive, solid-state drive, external storage, redundant array of independent disks (RAID), network-attached storage, optical storage, tape drive, backup system, cloud storage, or any combination of the foregoing.
  • Storage 250 may be, or may include therein, a database or databases or data stored in other configurations, and may include a stored copy of operational software such as operating system 222 and software portions of client agent 224 . Many other configurations are also possible, and are intended to be encompassed within the broad scope of this specification.
  • Network interface 260 may be provided to communicatively couple client device 200 to a wired or wireless network, and may be a network interface card (NIC), host fabric interface (HFI), or any other suitable device.
  • a “network,” as used throughout this specification, may include any communicative platform operable to exchange data or information within or between computing devices, including by way of non-limiting example, an ad-hoc local network, an internet architecture providing computing devices with the ability to electronically interact, a plain old telephone system (POTS), which computing devices could use to perform transactions in which they may be assisted by human operators or in which they may manually key data into a telephone or other suitable electronic equipment, any packet data network (PDN) offering a communications interface or exchange between any two nodes in a system, or any local area network (LAN), metropolitan area network (MAN), wide area network (WAN), wireless local area network (WLAN), virtual private network (VPN), intranet, or any other appropriate architecture or system that facilitates communications in a network or telephonic environment.
  • LAN local area network
  • MAN metropolitan area network
  • Client agent 224 in one example, is operable to carry out computer-implemented methods as described in this specification.
  • Client agent 224 may include one or more tangible non-transitory computer-readable mediums having stored thereon executable instructions operable to instruct a processor to provide a client agent 224 .
  • an “engine” includes any combination of one or more logic elements, of similar or dissimilar species, operable for and configured to perform one or more methods provided by the engine.
  • client agent 224 may comprise one or more logic elements configured to provide methods as disclosed in this specification.
  • client agent 224 may include a special integrated circuit designed to carry out a method or a part thereof, and may also include software instructions operable to instruct a processor to perform the method.
  • client agent 224 may run as a “daemon” process.
  • a “daemon” may include any program or series of executable instructions, whether implemented in hardware, software, firmware, or any combination thereof, that runs as a background process, a terminate-and-stay-resident program, a service, system extension, control panel, bootup procedure, BIOS subroutine, or any similar program that operates without direct user interaction.
  • daemon processes may run with elevated privileges in a “driver space,” or in ring 0, 1, or 2 in a protection ring architecture.
  • client agent 224 may also include other hardware and software, including configuration files, registry entries, and interactive or user-mode software by way of non-limiting example.
  • client agent 224 includes executable instructions stored on a non-transitory medium operable to perform a method according to this specification.
  • processor 210 may retrieve a copy of the instructions from storage 250 and load it into memory 220 .
  • Processor 210 may then iteratively execute the instructions of client agent 224 to provide the desired method.
  • Client agent 224 may provide an interface and services for an end user, such as corporate user 120 or customer 160 to access services provided by a server 300 .
  • client agent 224 may be or comprise a web browser, an e-mail client, or a client for a program such as file transfer protocol (FTP), telnet, secure shell (SSH), network time protocol (NTP), or any other suitable network-based program.
  • FTP file transfer protocol
  • SSH secure shell
  • NTP network time protocol
  • Peripheral interface 240 may be configured to interface with any auxiliary device that connects to client device 200 but that is not necessarily a part of the core architecture of client device 200 .
  • a peripheral may be operable to provide extended functionality to client device 200 , and may or may not be wholly dependent on client device 200 .
  • a peripheral may be a computing device in its own right.
  • Peripherals may include input and output devices such as displays, terminals, printers, keyboards, mice, modems, data ports (e.g., serial, parallel, USB, Firewire, or similar), network controllers, optical media, external storage, sensors, transducers, actuators, controllers, data acquisition buses, cameras, microphones, or speakers by way of non-limiting example.
  • peripherals include display adapter 242 , audio driver 244 , and input/output (I/O) driver 246 .
  • Display adapter 242 may be configured to provide a human-readable visual output, such as a command-line interface (CLI) or graphical desktop such as Microsoft Windows, Apple OSX desktop, or a Unix/Linux X Window System-based desktop.
  • Display adapter 242 may provide output in any suitable format, such as a coaxial output, composite video, component video, VGA, or digital outputs such as DVI or HDMI, by way of nonlimiting example.
  • display adapter 242 may include a hardware graphics card, which may have its own memory and its own graphics processing unit (GPU).
  • Audio driver 244 may provide an interface for audible sounds, and may include in some examples a hardware sound card. Sound output may be provided in analog (such as a 3.5 mm stereo jack), component (“RCA”) stereo, or in a digital audio format such as S/PDIF, AES3, AES47, HDMI, USB, Bluetooth or Wi-Fi audio, by way of non-limiting example.
  • FIG. 3 is a block diagram of a server-class device 300 according to one or more examples of the present specification.
  • Server 300 may be any suitable computing device or apparatus, as described in connection with FIG. 2 . In general, the definitions and examples of FIG. 2 may be considered as equally applicable to FIG. 3 , unless specifically stated otherwise.
  • Server 300 is described herein separately to illustrate that in certain embodiments, logical operations according to this specification may be divided along a client-server model, wherein client device 200 provides certain localized tasks, while server 300 provides certain other centralized tasks. In contemporary practice, server 300 is more likely than client device 200 to be provided as a “headless” VM running on a computing cluster, or as a standalone appliance, though these configurations are not required.
  • Server 300 includes processor 310 , which may include multiple cores, and which may be connected to a memory 320 , having stored therein executable instructions for providing an operating system 322 and at least software portions of a server engine 324 .
  • Other components of server 300 include a storage 350 , and network interface 360 . As described in FIG. 2 , each logical block may be provided by one or more similar or dissimilar logic elements.
  • processor 310 is communicatively coupled to memory 320 via memory bus 370 - 3 , which may be for example a direct memory access (DMA) bus or other suitable memory interconnect.
  • processor 310 may be communicatively coupled to other devices via a system bus 370 - 1 .
  • Processor 310 may be connected to memory 320 in a DMA configuration via DMA bus 370 - 3 , or via any other suitable memory configuration.
  • memory 320 may include one or more logic elements of any suitable type.
  • Storage 350 may be any species of memory 320 , or may be a separate device, as described in connection with storage 250 of FIG. 2 .
  • Storage 350 may be, or may include therein, a database or databases or data stored in other configurations, and may include a stored copy of operational software such as operating system 322 and software portions of server engine 324 .
  • Network interface 360 may be provided to communicatively couple server 140 to a wired or wireless network, and may include one or more logic elements as described in FIG. 2 .
  • Network interface 360 may be a NIC, HFI, or any other suitable device.
  • Server engine 324 is an engine as described in FIG. 2 and, in one example, includes one or more logic elements operable to carry out computer-implemented methods as described in this specification. Software portions of server engine 324 may run as a daemon process.
  • Server engine 324 may include one or more non-transitory computer-readable mediums having stored thereon executable instructions operable to instruct a processor to provide a security engine.
  • processor 310 may retrieve a copy of server engine 324 (or software portions thereof) from storage 350 and load it into memory 320 .
  • Processor 310 may then iteratively execute the instructions of server engine 324 to provide the desired method.
  • Server engine 324 may be any suitable server program, including a daemon, for providing a network-based service. These could include, by way of nonlimiting example, an HTTP or web server, an e-mail server, an FTP server, telnet server, SSH server, NTP server, or any other suitable server. In certain embodiments, server engine 324 is the “application” of ALNQ as described herein. In certain embodiments, server engine 324 may perform the method of FIG. 8 or related methods.
  • a queuing engine 326 may also be provided, and may be an engine as described herein. Queuing engine 326 may be configured to provide ALNQ queuing services, including performing the method of FIG. 7 or related methods.
  • FIG. 4 is a block diagram of selected elements of a server 300 according to one or more examples of the present specification.
  • NIC 360 connects server 300 to network 170, which in turn is communicatively coupled to a plurality of end-user devices 462.
  • End-user devices 462 access server 300 for the purpose of accessing services provided by server 300 , such as via server engine 324 or any other appropriate application.
  • server 300 includes eight cores 310 .
  • Each core 310 may be dedicated to a different service or thread of a service as appropriate, though that is a nonlimiting example only, and it is also possible for processors 310 to share different processes.
  • a web server 420 is provided. Web server 420 is divided into four separate threads. Specifically, web 0 420 - 0 runs on core 0 310 - 0 . Web 1 420 - 1 runs on core 1 310 - 1 . Web 2 420 - 2 runs on core 2 310 - 2 . Web 3 420 - 3 runs on core 3 310 - 3 .
  • An email server 430 is also provided.
  • two cores are dedicated to email server 430 .
  • email 0 430 - 0 runs on core 4 310 - 4 .
  • Email 1 430 - 1 runs on core 5 310 - 5 .
  • the last two cores are committed to support background processes. Specifically, support applications 450 run on core 6 310 - 6 . Operating system 322 runs on core 7 310 - 7 . It should be understood that the number of cores, the number of threads, and the distribution of threads among the several cores is provided by way of nonlimiting example. In other embodiments, other configurations are possible. In some cases, a machine may have many more cores, and many more threads. In other cases, a machine may have fewer cores and fewer threads. Any suitable combination of cores and threads is available.
  • NIC 360 exposes queues 450 numbered 0 through 15 .
  • queues 0 through 3 comprising group 452
  • queues 4 and 5 comprising group 454
  • Queues 6 through 15 are considered auxiliary queues, and are allocated for use by operating system 322 and support application 450 .
  • queuing engine 326 inspects the packet in a top-down fashion. In other words, queuing engine 326 inspects the upper layer information first. In this example, queuing engine 326 is specifically looking at which logical port the packet is designated for. In an example, a packet is assigned to group 452 if and only if it is port 80 traffic. Thus, all port 80 traffic is assigned to group 452, and no traffic that is not port 80 traffic is assigned to group 452.
  • the traffic is assigned to group 454 .
  • all port 25 traffic is assigned to group 454
  • no traffic that is not port 25 traffic is assigned to group 454.
  • Any traffic that is not either port 80 or port 25 is assigned to one of the auxiliary queues, numbered 6 through 15, according to standard assignment methods. This may include hashing or similar. This means that web server 420 knows that all of its traffic will appear in group 452, and that no outside traffic will appear in the group. Similarly, email server 430 knows that all of its traffic will appear in group 454, and that no outside traffic will appear in the group. Conversely, operating system 322 knows that those two groups are completely “owned” by their respective applications, so OS 322 can in appropriate circumstances circumvent some of its usual security procedures and let those applications handle their own internal security.
  • Assignment of traffic to auxiliary queues 6 through 15 may be accomplished with any suitable method, including in some embodiments prior art methods such as RSS, Flow Director, or similar. Similarly, once a packet has been assigned to either group 452 or 454, assignment of the packet to queues within the group may be accomplished by any suitable means, including RSS, Flow Director, or equivalent.
  • group 452 includes only traffic on port 80 designated for web server 420.
  • the operating system now has the flexibility to impose application specific policies and optimizations directly. As an example it could pass the queues directly to the application without having to consider security violations from having traffic from other applications.
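  • By way of nonlimiting illustration, the top-down assignment described for FIG. 4 may be sketched in Python as follows. The table names, the `flow_hash` helper, and the use of CRC32 are assumptions made for illustration only; CRC32 merely stands in for whatever load-balancing scheme (such as RSS or Flow Director) a given NIC uses within a group.

```python
import zlib

# Hypothetical group table mirroring FIG. 4: destination port -> queues owned by one application.
APP_GROUPS = {
    80: [0, 1, 2, 3],  # group 452, web server 420
    25: [4, 5],        # group 454, email server 430
}
# Queues 6 through 15 are the auxiliary queues shared by the OS and support applications.
AUX_QUEUES = list(range(6, 16))


def flow_hash(src_ip, src_port, dst_ip, dst_port):
    """Stand-in for an RSS-style flow hash; CRC32 is used only for illustration."""
    key = f"{src_ip}:{src_port}>{dst_ip}:{dst_port}".encode()
    return zlib.crc32(key)


def assign_queue(src_ip, src_port, dst_ip, dst_port):
    """Select the group by destination port first, then a queue inside that group by hash."""
    candidates = APP_GROUPS.get(dst_port, AUX_QUEUES)
    return candidates[flow_hash(src_ip, src_port, dst_ip, dst_port) % len(candidates)]
```

  • In this sketch, a packet can land in queues 0 through 3 only if its destination port is 80, and in queues 4 and 5 only if its destination port is 25, which reflects the "if and only if" property described above.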
  • FIG. 5 is a view of an additional embodiment according to one or more examples of the present specification.
  • groups 452 and 454 are twice as large as they are in the example of FIG. 4 . This is to illustrate that while one-to-one mapping between queues and threads is possible, n-to-one mapping is also possible, for any n greater than 1.
  • groups 452 and 454 both reserve two separate queues for each thread.
  • queues 0 and 1 are reserved for web server thread 420 - 0 .
  • Queues 2 and 3 are for web server thread 420 - 1 .
  • Queues 4 and 5 are for web server thread 420 - 2 .
  • Queues 6 and 7 are for web server thread 420 - 3 .
  • queues 8 and 9 are for email server thread 430 - 0 .
  • Queues 10 and 11 are for email server thread 430 - 1 .
  • queues 12 through 15 are allocated for general purposes, and may be used for support function 450 and operating system 322 , including all traffic not on port 80 or 25 .
  • FIG. 4 and FIG. 5 are nonlimiting examples of potential configurations. Other configurations are possible, including a configuration in which the number of queues exactly matches the number of cores, or where the number of queues is related to the number of cores by a non-integer function.
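  • As a nonlimiting sketch, the n-to-one reservation of FIG. 5 may be expressed as a small Python helper. The function name `build_queue_map` and the thread labels are hypothetical and are not part of the specification.

```python
def build_queue_map(threads, queues_per_thread, first_queue=0):
    """Reserve `queues_per_thread` consecutive queues for each thread.

    Returns the thread -> queues mapping and the index of the next free queue.
    """
    mapping = {}
    next_queue = first_queue
    for thread in threads:
        mapping[thread] = list(range(next_queue, next_queue + queues_per_thread))
        next_queue += queues_per_thread
    return mapping, next_queue


# Reproduce the FIG. 5 layout: two queues per thread, 16 queues in total.
web_map, next_free = build_queue_map(["web0", "web1", "web2", "web3"], 2)
email_map, next_free = build_queue_map(["email0", "email1"], 2, first_queue=next_free)
general_queues = list(range(next_free, 16))
```

  • With these inputs, `web_map["web0"]` is `[0, 1]`, `email_map["email1"]` is `[10, 11]`, and `general_queues` is `[12, 13, 14, 15]`, matching the allocation described for FIG. 5.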
  • FIG. 6 is a block diagram illustration of a traditional OSI network stack model.
  • the network stack includes seven layers, specifically physical layer 610 , data link layer 620 , network layer 630 , transport layer 640 , session layer 650 , presentation layer 660 , and application layer 670 .
  • the OSI model is presented here for purposes of illustration, but it should be understood that the OSI model is a nonlimiting example only.
  • the “internet protocol suite” (often referred to as TCP/IP) is also commonly used. TCP/IP layers may be similar but not identical to OSI layers.
  • traffic incoming to a network interface such as NIC 360 is assigned to queues such as queues 450.
  • FIG. 7 is a flowchart of a method 700 performed by queuing engine 326 , particularly for received packets, according to one or more examples of the present specification.
  • NIC 360 receives an incoming packet. This means that the physical interface receives electrical signals providing an incoming data packet.
  • queuing engine 326 parses the packet in a top-down manner. Specifically, in this example, queuing engine 326 extracts the destination port for the packet, though other values may be used.
  • queuing engine 326 determines whether the port is configured as a “listening” port. If so, it identifies a number of queues in the group allocated for this port.
  • queuing engine 326 load balances or distributes the packet to its assigned queue. For example, if group 452 has been designated for HTTP traffic, and if the incoming packet is on port 80, then queuing engine 326 may assign the packet to one of the queues in group 452 according to an appropriate load-balancing algorithm. RSS, Flow Director, or similar may be used to identify the specific queue to assign the packet into. The packet may then be made available via direct memory access (DMA), so that it can be accessed independently of the processor where appropriate.
  • queuing engine 326 sends an interrupt to the host.
  • the interrupt driver does not trigger an interrupt with every incoming packet. Rather, interrupts are triggered only when a threshold of packets or time has been reached.
  • An interrupt service routine may then signal server engine 324 directly. Interrupts are asserted depending on the triggering logic, and if asserted signal the application. In either case, polling from server engine 324 may trigger the packet flow.
  • queuing engine 326 sends a queuing notification to the appropriate process, such as a thread of server engine 324 .
  • queuing engine 326 may send a notification to the appropriate thread such as thread 0 420 - 0 of the process.
  • the driver may pick the packet from the queue and pass along with it opaque information that identifies the specific queue. This information is stored in the sockets layer and is available to the application.
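  • The interrupt triggering logic of method 700, in which an interrupt is raised only when a threshold of packets or time has been reached, may be sketched as follows. This is a simplified model under stated assumptions: the class name and both thresholds are hypothetical, and the time threshold is evaluated only when a packet arrives, whereas a hardware implementation would typically also use a timer.

```python
class InterruptModerator:
    """Raise an interrupt only after a packet-count or elapsed-time threshold is reached."""

    def __init__(self, max_packets, max_wait):
        self.max_packets = max_packets  # hypothetical packet-count threshold
        self.max_wait = max_wait        # hypothetical time threshold, in seconds
        self.pending = 0                # packets queued since the last interrupt
        self.first_arrival = None       # arrival time of the oldest pending packet

    def on_packet(self, now):
        """Record a packet arriving at time `now`; return True if an interrupt should fire."""
        if self.pending == 0:
            self.first_arrival = now
        self.pending += 1
        if self.pending >= self.max_packets or now - self.first_arrival >= self.max_wait:
            # Threshold reached: reset the counters and signal the host.
            self.pending = 0
            self.first_arrival = None
            return True
        return False
```

  • Passing the clock value in as `now` keeps the sketch deterministic; an application thread that polls its queues may drain packets regardless of whether an interrupt has been asserted.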
  • FIG. 8 is a flowchart of a method 800 performed by an application, such as server engine 324 , according to one or more examples of the present specification.
  • the application thread makes an asynchronous poll call.
  • the Linux operating system provides a call known as EPOLL.
  • the application gets the opaque queue identifier for the socket it is watching and checks whether it matches the queue identifier of the other sockets it is watching on a particular event file descriptor. If so, it adds this socket to the watch list. Otherwise, it hands the socket to the event file descriptor that has the matching queue identifier, and the socket is added to that descriptor's watch list.
  • a single thread operates on a single event file descriptor. Thus, creating the match of watching sockets with the same queue identifier streamlines the traffic flow without interference from other applications.
  • the application thread polls its assigned queues according to the socket/event file descriptors provided by asynchronous polling. If the thread was asleep because there is no data (and consequently, no work for it to do), a wakeup may be performed directly on the sleeping thread, allowing the thread to come back and pull packets from its queues. In other words, protocol and descriptor processing occur in the context of the thread, not in the context of the interrupt, or deferred procedure call (DPC) (Windows)/SoftIRQ (Linux), which is scheduled as a result of an interrupt. In doing so, scheduling affinities are preserved.
  • the scheduler wakeup of the application (such as server engine 324) keeps all shared kernel data structures associated with packet delivery and transmission in the same caching/scheduling domain, aligning with the scheduling policies of an operating system/hypervisor.
  • Busy polling sockets may be streamlined to pull packets directly from the application-specific queues.
  • In decision block 840, if data are available, then in block 870, method 900 of FIG. 9 is performed.
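  • The matching of sockets to event file descriptors by queue identifier, as described for method 800, may be sketched as follows. The `EventDescriptor` class and `place_socket` function are hypothetical stand-ins; in a real system the event file descriptor might be an EPOLL instance, and the queue identifier would be obtained from the sockets layer rather than passed in by hand.

```python
class EventDescriptor:
    """Stand-in for one thread's event file descriptor and its watch list."""

    def __init__(self, queue_id):
        self.queue_id = queue_id  # queue identifier shared by every watched socket
        self.watch_list = []


def place_socket(sock, sock_queue_id, current, all_descriptors):
    """Watch `sock` on `current` if the queue identifiers match.

    Otherwise hand the socket to the descriptor whose queue identifier matches.
    """
    if sock_queue_id == current.queue_id:
        target = current
    else:
        target = next((d for d in all_descriptors if d.queue_id == sock_queue_id), None)
        if target is None:
            raise LookupError(f"no event descriptor for queue {sock_queue_id}")
    target.watch_list.append(sock)
    return target
```

  • Because a single thread operates on a single event file descriptor, every socket on a given watch list in this sketch is fed by the same queue, so one thread never has to service traffic that arrived on another thread's queue.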
  • FIG. 9 is a flowchart of a method 900 disclosing additional details of data consumption according to one or more examples of the present specification.
  • NIC 360 stamps a queue identifier token associated with a packet, and propagates the identifier up the network stack.
  • Application 324 then receives the queue identifier from NIC 360. Note that if there is more than one NIC 360, this may also include a device identifier (for example, NIC 0, queue 1; NIC 1, queue 1; etc.).
  • the socket layer stores the queue identifier as part of its state.
  • the queue identifier may be device specific.
  • application 324 consumes the data.
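  • As a nonlimiting sketch of method 900, the queue identifier may be modeled as a token that carries both a device identifier and a queue number, and that the socket layer keeps as part of its state. All names below are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class QueueToken:
    """Opaque identifier stamped by the NIC: which device and which queue."""
    device: int
    queue: int


@dataclass
class SocketState:
    """Minimal stand-in for socket-layer state."""
    queue_token: Optional[QueueToken] = None


def deliver(sock_state, payload, device, queue):
    """Propagate the token up to the socket layer, then hand the data to the application."""
    sock_state.queue_token = QueueToken(device, queue)
    return payload
```

  • Including the device identifier keeps queue 1 of NIC 0 distinct from queue 1 of NIC 1, so the two tokens compare as unequal even though the queue numbers match.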
  • FIG. 10 is a flow chart of a method 1000 of performing a transmit operation with ALNQ according to one or more examples of the present specification. This method is understood as a nonlimiting example, and in particular, specific network layers are assigned operations in this flow chart by way of illustration.
  • application 324 may set queue mapping for the socket, such as via extended socket options. Note that this may be necessary only if the send-to queue is different from the last transmit operation.
  • application 324 sends the packet using the normal socket SEND call.
  • the socket layer of the network stack appends the queue identifier to the sent packet.
  • the protocol and network layers perform their normal, known functions.
  • the device layer checks the queue identifier and places the packet on a specified queue for transmission.
  • the device layer signals the appropriate NIC 360 that the packet is ready for transmission.
  • NIC 360 transmits the packet.
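  • The transmit flow of method 1000 may be sketched as follows, by way of nonlimiting example. The `TxSocket` and `Device` classes and their methods are hypothetical; `set_queue_mapping` merely stands in for an extended socket option, and the protocol and network layers are omitted because they are unchanged.

```python
from collections import deque


class Device:
    """Stand-in for the device layer with per-queue transmit rings."""

    def __init__(self, num_queues):
        self.tx_queues = [deque() for _ in range(num_queues)]


class TxSocket:
    """Stand-in for a socket that remembers its transmit queue mapping."""

    def __init__(self):
        self.queue_id = None

    def set_queue_mapping(self, queue_id):
        # Only needed when the target queue differs from the previous send.
        self.queue_id = queue_id

    def send(self, device, payload):
        if self.queue_id is None:
            raise ValueError("no queue mapping set for this socket")
        tagged = (self.queue_id, payload)        # socket layer appends the queue identifier
        queue_id, data = tagged                  # device layer checks the identifier
        device.tx_queues[queue_id].append(data)  # and places the packet on that queue
        return queue_id
```

  • A typical use sets the mapping once and then calls `send` repeatedly; the mapping is changed only when the application wants subsequent packets to leave on a different queue.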
  • client device 200 and server 300 may both be examples of computing apparatuses or other computing devices.
  • An SoC (system-on-a-chip) represents an integrated circuit (IC) that integrates components of a computer or other electronic system into a single chip.
  • client devices 110 or server devices 300 may be provided, in whole or in part, in an SoC.
  • the SoC may contain digital, analog, mixed-signal, and radio frequency functions, all of which may be provided on a single chip substrate.
  • Other embodiments may include a multi-chip-module (MCM), with a plurality of chips located within a single electronic package and configured to interact closely with each other through the electronic package.
  • the computing functionalities disclosed herein may be implemented in one or more silicon cores in Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs), and other semiconductor chips.
  • any suitably-configured processor such as processor 310
  • Any processor disclosed herein could transform an element or an article (for example, data) from one state or thing to another state or thing.
  • some activities outlined herein may be implemented with fixed logic or programmable logic (for example, software and/or computer instructions executed by a processor) and the elements identified herein could be some type of a programmable processor, programmable digital logic (for example, a field programmable gate array (FPGA), an erasable programmable read only memory (EPROM), an electrically erasable programmable read only memory (EEPROM)), an ASIC that includes digital logic, software, code, electronic instructions, flash memory, optical disks, CD-ROMs, DVD ROMs, magnetic or optical cards, other types of machine-readable mediums suitable for storing electronic instructions, or any suitable combination thereof.
  • a storage such as storage 350 may store information in any suitable type of tangible, non-transitory storage medium (for example, random access memory (RAM), read only memory (ROM), field programmable gate array (FPGA), erasable programmable read only memory (EPROM), electrically erasable programmable ROM (EEPROM), etc.), software, hardware (for example, processor instructions or microcode), or in any other suitable component, device, element, or object where appropriate and based on particular needs.
  • the information being tracked, sent, received, or stored in a processor could be provided in any database, register, table, cache, queue, control list, or storage structure, based on particular needs and implementations, all of which could be referenced in any suitable timeframe.
  • a non-transitory storage medium herein is expressly intended to include any non-transitory special-purpose or programmable hardware configured to provide the disclosed operations, or to cause a processor such as processor 310 to perform the disclosed operations.
  • Computer program logic implementing all or part of the functionality described herein is embodied in various forms, including, but in no way limited to, a source code form, a computer executable form, machine instructions or microcode, programmable hardware, and various intermediate forms (for example, forms generated by an assembler, compiler, linker, or locator).
  • source code includes a series of computer program instructions implemented in various programming languages, such as an object code, an assembly language, or a high-level language such as OpenCL, Fortran, C, C++, JAVA, or HTML for use with various operating systems or operating environments, or in hardware description languages such as Spice, Verilog, and VHDL.
  • the source code may define and use various data structures and communication messages.
  • the source code may be in a computer executable form (e.g., via an interpreter), or the source code may be converted (e.g., via a translator, assembler, or compiler) into a computer executable form, or converted to an intermediate form such as byte code.
  • any of the foregoing may be used to build or describe appropriate discrete or integrated circuits, whether sequential, combinatorial, state machines, or otherwise.
  • any number of electrical circuits of the FIGURES may be implemented on a board of an associated electronic device.
  • the board can be a general circuit board that can hold various components of the internal electronic system of the electronic device and, further, provide connectors for other peripherals. More specifically, the board can provide the electrical connections by which the other components of the system can communicate electrically.
  • Any suitable processor and memory can be suitably coupled to the board based on particular configuration needs, processing demands, and computing designs.
  • Other components such as external storage, additional sensors, controllers for audio/video display, and peripheral devices may be attached to the board as plug-in cards, via cables, or integrated into the board itself.
  • the electrical circuits of the FIGURES may be implemented as stand-alone modules (e.g., a device with associated components and circuitry configured to perform a specific application or function) or implemented as plug-in modules into application specific hardware of electronic devices.
  • a network interface card comprising: an ingress interface to receive incoming traffic; a plurality of queues to queue incoming traffic; an egress interface to direct incoming traffic to a plurality of server applications; and a queuing engine, including logic to: uniquely associate a queue with a selected server application; receive an incoming network packet; determine that the selected server application may process the incoming network packet; and assign the incoming network packet to the queue.
  • determining that the selected server application may process the incoming network packet comprises inspecting an application identifier of the incoming network packet.
  • NIC wherein the application identifier comprises a port number.
  • NIC further comprising means for notifying the selected server application of the assigning.
  • NIC further comprising means to enable the selected server application to bypass an interrupt mechanism and to perform per-thread, non-interfering busy polling.
  • NIC wherein the queuing engine is further to receive a queue mapping for a socket via an extended socket option.
  • the queueing engine further includes logic to uniquely associate a plurality of queues with the selected server application.
  • the queueing engine further includes logic to uniquely assign a second queue to a second selected server application.
  • the queueing engine further includes logic to assign a support queue to handle incoming packets not directed to a queue uniquely assigned to a server application.
  • the queueing engine further includes logic to assign a plurality of support queues.
  • NIC wherein the plurality of support queues number less than a sum of queues uniquely assigned to one or more selected server applications.
  • a tangible, non-transitory storage medium having stored thereon instructions to instruct a network interface card (NIC) to: deterministically associate an ingress queue of the NIC with a server function to be serviced by the NIC; receive an incoming network packet on an ingress interface of the NIC; determine that the server function may process the incoming network packet; and assign the incoming network packet to the queue.
  • determining that the selected server application may process the incoming network packet comprises inspecting an application identifier of the incoming network packet.
  • queueing engine further includes logic to uniquely associate a plurality of queues with the selected server application.
  • the queueing engine further includes logic to uniquely assign a second queue to a second selected server application.
  • the queueing engine further includes logic to assign a support queue to handle incoming packets not directed to a queue uniquely assigned to a server application.
  • an example method of providing deterministic per-application packet queueing comprising: uniquely associating one or more ingress queues of a network interface card (NIC) with a server function to be serviced by the NIC; receiving an incoming network packet on an ingress interface of the NIC; determining that the network packet is directed to the associated server function; and assigning the incoming network packet to a queue selected from the one or more uniquely associated queues.
  • determining that the network packet is directed to the associated server function comprises inspecting an application identifier of the incoming network packet.

Abstract

There is disclosed in one example a network interface card (NIC), comprising: an ingress interface to receive incoming traffic; a plurality of queues to queue incoming traffic; an egress interface to direct incoming traffic to a plurality of server applications; and a queuing engine, including logic to: uniquely associate a queue with a selected server application; receive an incoming network packet; determine that the selected server application may process the incoming network packet; and assign the incoming network packet to the queue.

Description

    CROSS-REFERENCE TO RELATED APPLICATIONS
  • This application claims priority to U.S. Non-Provisional application Ser. No. 14/998,138 filed 26 Dec. 2015, titled “Application-Level Network Queueing.” The above related application is incorporated herein by reference in its entirety.
  • FIELD OF THE SPECIFICATION
  • This disclosure relates in general to the field of computer networking, and more particularly, though not exclusively, to a system and method for application-level network queueing.
  • BACKGROUND
  • Busy polling, also known as “busy waiting” or “spinning,” is a technique in which a process repeatedly monitors a condition to see if it has become true. For example, if the process needs input, it may continually loop, watching an input buffer to see if data have become available. Other uses may be to see if a key has been pressed, or if a lock is available. Another use is to create a known time delay, such as by “doing nothing” for a specified number of milliseconds, though variations in processor speed may make this not completely reliable.
  • Busy polling may be a useful technique particularly in low-level programming, where interrupt-driven processing may be inefficient or cumbersome.
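  • By way of illustration only, a busy-polling loop that watches an input buffer may be sketched in Python as follows. The function name and the timeout parameter are assumptions added so that the loop cannot spin forever; they are not part of the specification.

```python
import time
from collections import deque


def busy_poll(buffer, timeout=1.0):
    """Spin until `buffer` holds an item or `timeout` seconds pass.

    Returns the oldest item, or None if nothing arrived in time.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if buffer:  # the condition being monitored
            return buffer.popleft()
    return None
```

  • The loop never sleeps or waits for an interrupt, which is what distinguishes busy polling from interrupt-driven processing: the calling thread trades processor time for lower wakeup latency.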
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • The present disclosure is best understood from the following detailed description when read with the accompanying figures. It is emphasized that, in accordance with the standard practice in the industry, various features are not necessarily drawn to scale, and are used for illustration purposes only. Where a scale is shown, explicitly or implicitly, it provides only one illustrative example. In other embodiments, the dimensions of the various features may be arbitrarily increased or reduced for clarity of discussion.
  • FIG. 1 is a block diagram of a security-enabled network according to one or more examples of the present specification.
  • FIG. 2 is a block diagram of a client device according to one or more examples of the present specification.
  • FIG. 3 is a block diagram of a server device according to one or more examples of the present specification.
  • FIG. 4 is a block diagram of a network queueing system according to one or more examples of the present specification.
  • FIG. 5 is a block diagram of a network queueing system according to one or more examples of the present specification.
  • FIG. 6 is a block diagram of an Open Systems Interconnection model (OSI) seven-layer stack according to one or more examples of the present specification.
  • FIG. 7 is a flow chart of a method performed by a network device according to one or more examples of the present specification.
  • FIG. 8 is a flow chart of a method performed by an application according to one or more examples of the present specification.
  • FIG. 9 is a flow chart of a method performed by an application according to one or more examples of the present specification.
  • FIG. 10 is a flow chart of a transmit operation according to one or more examples of the present specification.
  • SUMMARY
  • In an example, there is disclosed a network interface card (NIC), comprising: an ingress interface to receive incoming traffic; a plurality of queues to queue incoming traffic; an egress interface to direct incoming traffic to a plurality of server applications; and a queuing engine, including logic to: uniquely associate a queue with a selected server application; receive an incoming network packet; determine that the selected server application may process the incoming network packet; and assign the incoming network packet to the queue.
  • Embodiments of the Disclosure
  • The following disclosure provides many different embodiments, or examples, for implementing different features of the present disclosure. Specific examples of components and arrangements are described below to simplify the present disclosure. These are, of course, merely examples and are not intended to be limiting. Further, the present disclosure may repeat reference numerals and/or letters in the various examples. This repetition is for the purpose of simplicity and clarity and does not in itself dictate a relationship between the various embodiments and/or configurations discussed. Different embodiments may have different advantages, and no particular advantage is necessarily required of any embodiment.
  • By way of example, modern day data center tiers may include web servers and data caching servers, with “Big Data” backends like the “Hadoop” software framework. These may be networked using leaf spine architectures. In one example, these tiers are distributed systems in and of themselves and typically service many thousands of concurrent connections, with the number of connections expected to increase in the future as the number of cores on the end nodes increase.
  • Some data centers may have a target performance determinism, such as to meet contractual requirements or as a competitive differentiator. Providing the desired performance determinism (e.g., 99.99 percentile latency, with consistent throughput and CPU utilization) for distributed systems applications in this environment may be a challenge. In many cases, the networking interface is one of the primary contributors to this issue, as it can become a bottleneck to application performance on a busy, high-end server.
  • This specification describes a system and method for application-level network queuing (ALNQ), which addresses network interface bottlenecks in a scalable manner. In an embodiment of ALNQ, an improved network interface card (NIC) is provided, and adjustments may also be made to the existing network stack, such as the Open Systems Interconnection (OSI) or transmission control protocol/internet protocol (TCP/IP) stack. Alternatively, in some embodiments, the host network stack is bypassed in whole or in part, and parallel offloaded or onloaded stacks may be run.
  • Many NICs divide incoming traffic into a plurality of separate “queues,” with traffic being load balanced to queues based on, for example, a hashing algorithm based on a socket tuple that includes the source media access control address (MAC), destination MAC, and/or protocol. ALNQ addresses issues such as the desire of applications to have their own “visible” queues and load-balancing control using the operating system's networking stack. ALNQ makes this possible by creating scalable application-specific queues on NICs with socket layer visibility and an affinitization and wakeup algorithm applied to kernel event multiplexing interfaces like “EPOLL” and “KQUEUE.” In selected embodiments, this leads to lower and deterministic average and tail latencies, and higher throughput within latency tolerance boundaries, which may help in scaling distributed systems.
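  • As a rough illustration of the conventional hash-based load balancing described above, the following Python sketch hashes a flow tuple to pick a queue. It is only a sketch under stated assumptions: `zlib.crc32` stands in for whatever hash a given NIC implements (RSS hardware typically uses a Toeplitz hash), and the names `FlowTuple` and `select_queue` are hypothetical and do not appear in this specification.

```python
import zlib
from typing import NamedTuple


class FlowTuple(NamedTuple):
    # Hypothetical field set, following the tuple named in the text.
    src_mac: str
    dst_mac: str
    protocol: str


def select_queue(flow: FlowTuple, num_queues: int) -> int:
    """Map a flow to a queue index by hashing its tuple.

    CRC32 is a stand-in hash chosen for illustration only.
    """
    key = "|".join(flow).encode("ascii")
    return zlib.crc32(key) % num_queues


flow = FlowTuple("aa:bb:cc:00:00:01", "aa:bb:cc:00:00:02", "tcp")
queue = select_queue(flow, 16)
```

  • Nothing in this hash refers to the application that will consume the packet, so traffic for unrelated applications may land in the same queue.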
  • In an embodiment, ALNQ takes a top-down view with respect to steering traffic to an application and delivering low latency communication. Applications, from a networking perspective, may distinguish themselves within a system by virtue of a port number. For example, an HTTP daemon may listen on port 80, while an e-mail daemon may listen on port 25. ALNQ may use the port number (along with other relevant identifying information) as the primary key for assigning traffic to a queue, i.e. application specific queues. Traffic into these queues may be multiplexed via existing schemes like “Receive Side Scaling” (RSS) for hash-based load balancing, or Intel® Ethernet Flow Director for determinate matching.
  • Information about the assigned queue may be passed up to the application as an opaque identifier. The application may then use this identifier for alignment hints. For example, an application that uses EPOLL may align the application thread handling the file descriptors to the queue, thus streamlining access and mitigating effects from cache line bouncing of socket and protocol related data structures. Interrupts may also be tuned to better match application behavior, as these queues can be distributed between applications rather than between cores; distribution between cores may cause queues to be shared between applications. Thus, low latency as an attribute can be applied to an entire application, and therefore a queue set, rather than for a socket (or flow) that is open. This separation also allows for direct application signaling schemes for wakeup before protocol processing is performed.
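  • The alignment hint described above can be sketched in Python as follows. This is a minimal sketch, not the disclosed implementation: the specification does not say how the opaque identifier reaches the application, so here the tokens are plain integers supplied by the caller, and `align_connections` is a hypothetical helper. Each distinct token is pinned to one worker thread index, so all connections fed by the same queue are handled by the same thread.

```python
from collections import defaultdict


def align_connections(conn_tokens, num_threads):
    """Group connection ids by opaque queue token.

    conn_tokens: iterable of (connection_id, queue_token) pairs.
    Returns {thread_index: [connection_id, ...]}; every connection that
    shares a token is placed on the same thread index.
    """
    token_to_thread = {}
    thread_conns = defaultdict(list)
    for conn_id, token in conn_tokens:
        if token not in token_to_thread:
            # Assign threads to newly seen tokens in round-robin order.
            token_to_thread[token] = len(token_to_thread) % num_threads
        thread_conns[token_to_thread[token]].append(conn_id)
    return dict(thread_conns)


plan = align_connections([("a", 7), ("b", 9), ("c", 7)], num_threads=4)
```

  • In a real server, each thread index would correspond to a worker that registers only its own connections with its event multiplexing instance.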
  • Advantageously, ALNQ steers NIC traffic by assigning queues not per interface or virtual socket interface (VSI), or by VMs or traffic classes, but rather per port number, with the port number functioning as an application identifier. From a networking perspective, a port number may operate as a useful primitive for identifying an application on a particular machine. Note that the application identifier could be any suitable application identifier other than the port number.
  • Further advantageously, ALNQ can load balance the queues assigned to an application/port number using known hash-based schemes like RSS or exact match schemes like Flow Director.
  • Further advantageously, ALNQ may pass queue information as a token to the application, such as via an enhancement to the sockets interface options.
  • Further advantageously, applications may align threads to queues using these tokens.
  • Further advantageously, separating application-specific traffic early provides a range of affinitization choices, such as using existing low-latency methods such as Busy Poll Sockets (BPS), providing scalable aligning of queues to application threads, or applying quality of service or security policies.
  • Further advantageously, ALNQ enables an interrupt signaling scheme wherein interrupts signal a sleeping application thread directly, which may then pull data out from device queues.
  • A system and method for application level network queueing will now be described with more particular reference to the attached FIGURES. It should be noted that throughout the FIGURES, certain reference numerals may be repeated to indicate that a particular device or block is wholly or substantially consistent across the FIGURES. This is not, however, intended to imply any particular relationship between the various embodiments disclosed. In certain examples, a genus of elements may be referred to by a particular reference numeral (“widget 10”), while individual species or examples of the genus may be referred to by a hyphenated numeral (“first specific widget 10-1” and “second specific widget 10-2”).
  • FIG. 1 is a network-level diagram of a secured enterprise 100 according to one or more examples of the present specification. In this example, secured enterprise 100 may be configured to provide services or data to one or more customers 162, who may access information or services via external network 172. This may require secured enterprise 100 to at least partly expose certain services and networks to the outside world, thus creating a logical security aperture.
  • Within secured enterprise 100, one or more users 120 operate one or more client devices 110. Each device may include an appropriate operating system, such as Microsoft Windows, Linux, Android, Mac OSX, Apple iOS, Unix, or similar. Some of the foregoing may be more often used on one type of device than another. For example, desktop computers or engineering workstations may be more likely to use one of Microsoft Windows, Linux, Unix, or Mac OSX. Laptop computers, which are usually portable off-the-shelf devices with fewer customization options, may be more likely to run Microsoft Windows or Mac OSX. Mobile devices may be more likely to run Android or iOS. However, these examples are not intended to be limiting.
  • Client devices 110 may be communicatively coupled to one another and to other network resources via enterprise network 170. Enterprise network 170 may be any suitable network or combination of one or more networks operating on one or more suitable networking protocols, including for example, a local area network, an intranet, a virtual network, a wide area network, a wireless network, a cellular network, or the Internet (optionally accessed via a proxy, virtual machine, or other similar security mechanism) by way of nonlimiting example. Enterprise network 170 may also include one or more servers, firewalls, routers, switches, security appliances, antivirus servers, or other useful network devices, which in an example may be virtualized within workload cluster 142. In this illustration, enterprise network 170 is shown as a single network for simplicity, but in some embodiments, enterprise network 170 may include a large number of networks, such as one or more enterprise intranets connected to the internet. Enterprise network 170 may also provide access to an external network, such as the Internet, via external network 172. External network 172 may similarly be any suitable type of network.
  • A workload cluster 142 may be provided, for example as a virtual cluster running in a hypervisor on a plurality of rack-mounted blade servers, or as a cluster of physical servers. Workload cluster 142 may provide one or more server functions, or one or more “microclouds” in one or more hypervisors. For example, a virtualization environment such as vCenter may provide the ability to define a plurality of “tenants,” with each tenant being functionally separate from each other tenant, and each tenant operating as a single-purpose microcloud. Each microcloud may serve a distinctive function, and may include a plurality of virtual machines (VMs) of many different flavors, including agentful and agentless VMs.
  • It should be noted that some functionality of endpoint devices 110 may also be provided via workload cluster 142. For example, one microcloud may provide a remote desktop hypervisor such as a Citrix workspace, which allows users 120 operating endpoints 110 to remotely login to a remote enterprise desktop and access enterprise applications, workspaces, and data. In that case, endpoint 110 could be a “thin client” such as a Google Chromebook, running only a stripped-down operating system, and still provide user 120 useful access to enterprise resources.
  • One or more computing devices configured as a management console 140 may also operate on enterprise network 170. Management console 140 may provide a user interface for a security administrator 150 to define enterprise security policies, which management console 140 may enforce on enterprise network 170 and across client devices 110 and workload cluster 142. In an example, management console 140 may run a server-class operating system, such as Linux, Unix, or Windows Server. In other cases, management console 140 may be provided as a web interface, on a desktop-class machine, or via a VM provisioned within workload cluster 142.
  • Secured enterprise 100 may communicate across enterprise boundary 104 with external network 172. Enterprise boundary 104 may represent a physical, logical, or other boundary. External network 172 may include, for example, websites, servers, network protocols, and other network-based services. In another example, secured enterprise 100 may simply be a family, with parents assuming the role of security administrator 150. The parents may wish to protect their children from undesirable content, such as pornography, adware, spyware, age-inappropriate content, advocacy for certain political, religious, or social movements, or forums for discussing illegal or dangerous activities, by way of non-limiting example. In this case, the parent may perform some or all of the duties of security administrator 150.
  • Workload cluster 142 may be configured to provide any suitable network services to client devices, such as client device 110 and end user device 180. For example, workload cluster 142 may be a rack-mount cloud system providing security and infrastructure services such as a firewall, antivirus, deep packet inspection, load balancing, and switching by way of nonlimiting example. Workload cluster 142 may also provide substantive network services, such as a corporate file server, database file system, intranet server, or a database server by way of nonlimiting example. An enterprise user 120 may use a client application 112 on client device 110 to access services on workload cluster 142 via enterprise network 170.
  • End user 160 may also access network resources. End user 160 may operate an end user application 162 on end user device 180, and access workload cluster 142 via external network 172 and enterprise network 170. In accordance with good corporate security practice, network traffic from outside of secured enterprise 100 may be segregated from corporate network traffic, such as by placing enterprise traffic and external traffic on different subnetworks.
  • Workload cluster 142 may provide to external end users 160 various suitable network services, such as a website over hypertext transfer protocol (HTTP), e-mail over simple mail transfer protocol (SMTP), file sharing via file transfer protocol (FTP), secure FTP (SFTP), or virtual shells via protocols such as telnet or secure shell (SSH).
  • Note that while end user 160 and corporate user 120 are disclosed here as two separate entities by way of nonlimiting example, strict segregation is not required. An end user 160 may also be a corporate user, and may access corporate resources via a remote client such as Citrix or a remote desktop protocol (RDP). Corporate user 120 may also be a guest user with less than full access to enterprise resources.
  • To provide a simplified example by way of illustration, this specification will assume that workload cluster 142 is a single workload server 142 having one or more network interface cards (NICs). However, the methods disclosed herein can be scaled to any appropriate number of servers configured in any appropriate manner, such as standalone servers, service appliances, individual servers operating multiple protocols, or a plurality of virtual servers operating on one or a plurality of physical devices.
  • In this example, workload server 142 hosts both an HTTP web server receiving traffic on port 80, and an SMTP mail server receiving traffic on port 25. Workload server 142 may open a unique communication “socket” or other logical connection for each connection with a client device. Thus, for example, if 100,000 users, either corporate users 120 or end users 160, or any combination thereof, are accessing a website at a given time, then 100,000 HTTP sockets will be open on port 80. Similarly, if 50,000 e-mail users are accessing e-mail services, then 50,000 SMTP sockets will be open on port 25. These quantities are provided for illustration only. Smaller or independent servers may have far fewer simultaneous sockets open, while large, popular web services may have many more sockets open.
  • As illustrated in this specification, with a large number of concurrent connections, queueing can be a substantial challenge. Certain embodiments of traditional “bottom-up” hash-based queueing such as RSS, in which a packet is hashed starting with the lower levels of the OSI network model, may result in substantial latency due to traffic from other applications sharing the same queues. Thus, this specification provides a system and method for providing application-level network queuing to reduce latency and improve performance. Example logical embodiments are illustrated in FIGS. 4 and 5, while example methods are illustrated in FIGS. 7-9.
  • FIG. 2 is a block diagram of client device 200 according to one or more examples of the present specification. Client device 200 may be any suitable computing device. In various embodiments, a “computing device” may be or comprise, by way of non-limiting example, a computer, workstation, server, mainframe, virtual machine (whether emulated or on a “bare-metal” hypervisor), embedded computer, embedded controller, embedded sensor, personal digital assistant, laptop computer, cellular telephone, IP telephone, smart phone, tablet computer, convertible tablet computer, computing appliance, network appliance, receiver, wearable computer, handheld calculator, or any other electronic, microelectronic, or microelectromechanical device or apparatus for processing and communicating data. Any computing device may be designated as a host on the network. Each computing device may refer to itself as a “local host,” while any computing device external to it may be designated as a “remote host.”
  • In certain embodiments, client devices 110 may all be examples of client devices 200.
  • Client device 200 includes a processor 210 connected to a memory 220, having stored therein executable instructions for providing an operating system 222 and at least software portions of a client agent 224. Other components of client device 200 include a storage 250, network interface 260, and peripheral interface 240. This architecture is provided by way of example only, and is intended to be non-exclusive and non-limiting. Furthermore, the various parts disclosed are intended to be logical divisions only, and need not necessarily represent physically separate hardware and/or software components. Certain computing devices provide main memory 220 and storage 250, for example, in a single physical memory device, and in other cases, memory 220 and/or storage 250 are functionally distributed across many physical devices. In the case of virtual machines or hypervisors, all or part of a function may be provided in the form of software or firmware running over a virtualization layer to provide the disclosed logical function. In other examples, a device such as a network interface 260 may provide only the minimum hardware interfaces necessary to perform its logical operation, and may rely on a software driver to provide additional necessary logic. Thus, each logical block disclosed herein is broadly intended to include one or more logic elements configured and operable for providing the disclosed logical operation of that block. As used throughout this specification, “logic elements” may include hardware, external hardware (digital, analog, or mixed-signal), software, reciprocating software, services, drivers, interfaces, components, modules, algorithms, sensors, firmware, microcode, programmable logic, or objects that can coordinate to achieve a logical operation.
  • In an example, processor 210 is communicatively coupled to memory 220 via memory bus 270-3, which may be, for example, a direct memory access (DMA) bus, though other memory architectures are possible, including ones in which memory 220 communicates with processor 210 via system bus 270-1 or some other bus. Processor 210 may be communicatively coupled to other devices via a system bus 270-1. As used throughout this specification, a “bus” includes any wired or wireless interconnection line, network, connection, bundle, single bus, multiple buses, crossbar network, single-stage network, multistage network or other conduction medium operable to carry data, signals, or power between parts of a computing device, or between computing devices. It should be noted that these uses are disclosed by way of non-limiting example only, and that some embodiments may omit one or more of the foregoing buses, while others may employ additional or different buses.
  • In various examples, a “processor” may include any combination of logic elements operable to execute instructions, whether loaded from memory, or implemented directly in hardware, including by way of non-limiting example a microprocessor, digital signal processor, field-programmable gate array, graphics processing unit, programmable logic array, application-specific integrated circuit, or virtual machine processor. In certain architectures, a multi-core processor may be provided, in which case processor 210 may be treated as only one core of a multi-core processor, or may be treated as the entire multi-core processor, as appropriate. In some embodiments, one or more co-processors may also be provided for specialized or support functions.
  • Processor 210 may be connected to memory 220 in a DMA configuration via DMA bus 270-3 (or other similar memory interconnect). To simplify this disclosure, memory 220 is disclosed as a single logical block, but in a physical embodiment may include one or more blocks of any suitable volatile or non-volatile memory technology or technologies, including for example DDR RAM, SRAM, DRAM, cache, L1 or L2 memory, on-chip memory, registers, flash, ROM, optical media, virtual memory regions, magnetic or tape memory, or similar. In certain embodiments, memory 220 may comprise a relatively low-latency volatile main memory, while storage 250 may comprise a relatively higher-latency non-volatile memory. However, memory 220 and storage 250 need not be physically separate devices, and in some examples may represent simply a logical separation of function. It should also be noted that although DMA is disclosed by way of non-limiting example, DMA is not the only protocol consistent with this specification, and that other memory architectures are available.
  • Storage 250 may be any species of memory 220, or may be a separate device. Storage 250 may include one or more non-transitory computer-readable mediums, including by way of non-limiting example, a hard drive, solid-state drive, external storage, redundant array of independent disks (RAID), network-attached storage, optical storage, tape drive, backup system, cloud storage, or any combination of the foregoing. Storage 250 may be, or may include therein, a database or databases or data stored in other configurations, and may include a stored copy of operational software such as operating system 222 and software portions of client agent 224. Many other configurations are also possible, and are intended to be encompassed within the broad scope of this specification.
  • Network interface 260 may be provided to communicatively couple client device 200 to a wired or wireless network, and may be a network interface card (NIC), host fabric interface (HFI), or any other suitable device. A “network,” as used throughout this specification, may include any communicative platform operable to exchange data or information within or between computing devices, including by way of non-limiting example, an ad-hoc local network, an internet architecture providing computing devices with the ability to electronically interact, a plain old telephone system (POTS), which computing devices could use to perform transactions in which they may be assisted by human operators or in which they may manually key data into a telephone or other suitable electronic equipment, any packet data network (PDN) offering a communications interface or exchange between any two nodes in a system, or any local area network (LAN), metropolitan area network (MAN), wide area network (WAN), wireless local area network (WLAN), virtual private network (VPN), intranet, or any other appropriate architecture or system that facilitates communications in a network or telephonic environment.
  • Client agent 224, in one example, is operable to carry out computer-implemented methods as described in this specification. Client agent 224 may include one or more tangible non-transitory computer-readable mediums having stored thereon executable instructions operable to instruct a processor to provide a client agent 224. As used throughout this specification, an “engine” includes any combination of one or more logic elements, of similar or dissimilar species, operable for and configured to perform one or more methods provided by the engine. Thus, client agent 224 may comprise one or more logic elements configured to provide methods as disclosed in this specification. In some cases, client agent 224 may include a special integrated circuit designed to carry out a method or a part thereof, and may also include software instructions operable to instruct a processor to perform the method. In some cases, client agent 224 may run as a “daemon” process. A “daemon” may include any program or series of executable instructions, whether implemented in hardware, software, firmware, or any combination thereof, that runs as a background process, a terminate-and-stay-resident program, a service, system extension, control panel, bootup procedure, BIOS subroutine, or any similar program that operates without direct user interaction. In certain embodiments, daemon processes may run with elevated privileges in a “driver space,” or in ring 0, 1, or 2 in a protection ring architecture. It should also be noted that client agent 224 may also include other hardware and software, including configuration files, registry entries, and interactive or user-mode software by way of non-limiting example.
  • In one example, client agent 224 includes executable instructions stored on a non-transitory medium operable to perform a method according to this specification. At an appropriate time, such as upon booting client device 200 or upon a command from operating system 222 or a user 120, processor 210 may retrieve a copy of the instructions from storage 250 and load it into memory 220. Processor 210 may then iteratively execute the instructions of client agent 224 to provide the desired method.
  • Client agent 224 may provide an interface and services for an end user, such as corporate user 120 or customer 160, to access services provided by a server 300. For example, client agent 224 may be or comprise a web browser, an e-mail client, or a client for a program such as file transfer protocol (FTP), telnet, secure shell (SSH), network time protocol (NTP), or any other suitable network-based program.
  • Peripheral interface 240 may be configured to interface with any auxiliary device that connects to client device 200 but that is not necessarily a part of the core architecture of client device 200. A peripheral may be operable to provide extended functionality to client device 200, and may or may not be wholly dependent on client device 200. In some cases, a peripheral may be a computing device in its own right. Peripherals may include input and output devices such as displays, terminals, printers, keyboards, mice, modems, data ports (e.g., serial, parallel, USB, Firewire, or similar), network controllers, optical media, external storage, sensors, transducers, actuators, controllers, data acquisition buses, cameras, microphones, or speakers by way of non-limiting example.
  • In one example, peripherals include display adapter 242, audio driver 244, and input/output (I/O) driver 246. Display adapter 242 may be configured to provide a human-readable visual output, such as a command-line interface (CLI) or graphical desktop such as Microsoft Windows, Apple OSX desktop, or a Unix/Linux X Window System-based desktop. Display adapter 242 may provide output in any suitable format, such as a coaxial output, composite video, component video, VGA, or digital outputs such as DVI or HDMI, by way of nonlimiting example. In some examples, display adapter 242 may include a hardware graphics card, which may have its own memory and its own graphics processing unit (GPU). Audio driver 244 may provide an interface for audible sounds, and may include in some examples a hardware sound card. Sound output may be provided in analog (such as a 3.5 mm stereo jack), component (“RCA”) stereo, or in a digital audio format such as S/PDIF, AES3, AES47, HDMI, USB, Bluetooth or Wi-Fi audio, by way of non-limiting example.
  • FIG. 3 is a block diagram of a server-class device 300 according to one or more examples of the present specification. Server 300 may be any suitable computing device or apparatus, as described in connection with FIG. 2. In general, the definitions and examples of FIG. 2 may be considered as equally applicable to FIG. 3, unless specifically stated otherwise. Server 300 is described herein separately to illustrate that in certain embodiments, logical operations according to this specification may be divided along a client-server model, wherein client device 200 provides certain localized tasks, while server 300 provides certain other centralized tasks. In contemporary practice, server 300 is more likely than client device 200 to be provided as a “headless” VM running on a computing cluster, or as a standalone appliance, though these configurations are not required.
  • Server 300 includes processor 310, which may include multiple cores, and which may be connected to a memory 320, having stored therein executable instructions for providing an operating system 322 and at least software portions of a server engine 324. Other components of server 300 include a storage 350, and network interface 360. As described in FIG. 2, each logical block may be provided by one or more similar or dissimilar logic elements.
  • In an example, processor 310 is communicatively coupled to memory 320 via memory bus 370-3, which may be for example a direct memory access (DMA) bus or other suitable memory interconnect. Processor 310 may be communicatively coupled to other devices via a system bus 370-1.
  • Processor 310 may be connected to memory 320 in a DMA configuration via DMA bus 370-3, or via any other suitable memory configuration. As discussed in FIG. 2, memory 320 may include one or more logic elements of any suitable type.
  • Storage 350 may be any species of memory 320, or may be a separate device, as described in connection with storage 250 of FIG. 2. Storage 350 may be, or may include therein, a database or databases or data stored in other configurations, and may include a stored copy of operational software such as operating system 322 and software portions of server engine 324.
  • Network interface 360 may be provided to communicatively couple server 300 to a wired or wireless network, and may include one or more logic elements as described in FIG. 2. Network interface 360 may be a NIC, HFI, or any other suitable device.
  • Server engine 324 is an engine as described in FIG. 2 and, in one example, includes one or more logic elements operable to carry out computer-implemented methods as described in this specification. Software portions of server engine 324 may run as a daemon process.
  • Server engine 324 may include one or more non-transitory computer-readable mediums having stored thereon executable instructions operable to instruct a processor to provide server engine 324. At an appropriate time, such as upon booting server 300 or upon a command from operating system 322 or a user 120 or security administrator 150, processor 310 may retrieve a copy of server engine 324 (or software portions thereof) from storage 350 and load it into memory 320. Processor 310 may then iteratively execute the instructions of server engine 324 to provide the desired method.
  • Server engine 324 may be any suitable server program, including a daemon, for providing a network-based service. These could include, by way of nonlimiting example, an HTTP or web server, an e-mail server, an FTP server, telnet server, SSH server, NTP server, or any other suitable server. In certain embodiments, server engine 324 is the “application” of ALNQ as described herein. In certain embodiments, server engine 324 may perform the method of FIG. 8 or related methods.
  • A queuing engine 326 may also be provided, and may be an engine as described herein. Queuing engine 326 may be configured to provide ALNQ queuing services, including performing the method of FIG. 7 or related methods.
  • FIG. 4 is a block diagram of selected elements of a server 300 according to one or more examples of the present specification. In this example, NIC 360 connects server 300 to network 170, which in turn is communicatively coupled to a plurality of end-user devices 462. There may be many end-user devices 462, including thousands or more. End-user devices 462 access server 300 for the purpose of accessing services provided by server 300, such as via server engine 324 or any other appropriate application.
  • In this example, server 300 includes eight cores 310. Each core 310 may be dedicated to a different service or thread of a service as appropriate, though that is a nonlimiting example only, and it is also possible for processors 310 to share different processes. In this example, a web server 420 is provided. Web server 420 is divided into four separate threads. Specifically, web 0 420-0 runs on core 0 310-0. Web 1 420-1 runs on core 1 310-1. Web 2 420-2 runs on core 2 310-2. Web 3 420-3 runs on core 3 310-3.
  • An email server 430 is also provided. In this example, two cores are dedicated to email server 430. Specifically, email 0 430-0 runs on core 4 310-4. Email 1 430-1 runs on core 5 310-5.
  • The last two cores are committed to support background processes. Specifically, support applications 450 run on core 6 310-6. Operating system 322 runs on core 7 310-7. It should be understood that the number of cores, the number of threads, and the distribution of threads among the several cores is provided by way of nonlimiting example. In other embodiments, other configurations are possible. In some cases, a machine may have many more cores, and many more threads. In other cases, a machine may have fewer cores and fewer threads. Any suitable combination of cores and threads is available.
  • In this example, NIC 360 exposes queues 450 numbered 0 through 15. According to examples disclosed in this specification, queues 0 through 3, comprising group 452, are dedicated to traffic on port 80. Queues 4 and 5, comprising group 454, are dedicated to traffic on port 25. Queues 6 through 15 are considered auxiliary queues, and are allocated for use by operating system 322 and support application 450.
  • When an incoming packet comes from network 170 to NIC 360, queuing engine 326 inspects the packet in a top-down fashion. In other words, queuing engine 326 inspects the upper layer information first. In this example, queuing engine 326 is specifically looking at which logical port the packet is designated for. In an example, a packet is assigned to group 452 if and only if it is port 80 traffic. Thus, all port 80 traffic is assigned to group 452, and no traffic that is not port 80 traffic is assigned to group 452.
  • Similarly, if and only if incoming traffic is found to be email traffic on port 25, the traffic is assigned to group 454. In other words, all port 25 traffic is assigned to group 454, and no traffic that is not port 25 traffic is assigned to group 454.
  • Any traffic that is not either port 80 or port 25 is assigned to one of the auxiliary queues, numbered 6 through 15, according to standard assignment methods. This may include hashing or similar. This means that web server 420 knows that all of its traffic will appear in group 452, and that no outside traffic will appear in the group. Similarly, email server 430 knows that all of its traffic will appear in group 454, and that no outside traffic will appear in the group. Conversely, operating system 322 knows that those two groups are completely “owned” by their respective applications, so OS 322 can in appropriate circumstances circumvent some of its usual security procedures and let those applications handle their own internal security.
  • Assignment of traffic to auxiliary queues 6 through 15 may be accomplished with any suitable method, including in some embodiments prior art methods such as RSS, Flow Director, or similar. Similarly, once a packet has been assigned to either group 452 or 454, assignment of the packet to queues within the group may be accomplished by any suitable means, including RSS, Flow Director, or equivalent.
  • Advantageously, because it is known that group 452 includes only traffic on port 80 designated for web server 420, the operating system now has the flexibility to impose application-specific policies and optimizations directly. As an example, it could pass the queues directly to the application without having to consider security violations arising from traffic that belongs to other applications.
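  • The group assignment of FIG. 4 can be sketched as a two-step lookup: the destination port selects a queue group, and a flow hash selects a queue within that group. The following Python sketch is illustrative only. The names `QUEUE_GROUPS`, `AUXILIARY_QUEUES`, `flow_hash`, and `select_queue` are assumptions introduced here, and CRC32 merely stands in for whatever hash RSS, Flow Director, or a similar mechanism would apply.

```python
import zlib

# Hypothetical layout mirroring FIG. 4: destination port -> dedicated queue group.
QUEUE_GROUPS = {
    80: [0, 1, 2, 3],  # group 452, web server 420
    25: [4, 5],        # group 454, email server 430
}
AUXILIARY_QUEUES = list(range(6, 16))  # queues 6 through 15


def flow_hash(src_ip, src_port, dst_ip, dst_port):
    """Deterministic stand-in for an RSS-style flow hash (an assumption)."""
    key = f"{src_ip}:{src_port}>{dst_ip}:{dst_port}".encode()
    return zlib.crc32(key)


def select_queue(src_ip, src_port, dst_ip, dst_port):
    """Pick the group by destination port, then a queue within it by flow hash."""
    group = QUEUE_GROUPS.get(dst_port, AUXILIARY_QUEUES)
    return group[flow_hash(src_ip, src_port, dst_ip, dst_port) % len(group)]
```

  • Under these assumptions, every port 80 flow lands in queues 0 through 3, every port 25 flow lands in queues 4 and 5, and all other traffic is spread over queues 6 through 15.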
  • FIG. 5 is a view of an additional embodiment according to one or more examples of the present specification. In the example of FIG. 5, groups 452 and 454 are twice as large as they are in the example of FIG. 4. This is to illustrate that while one-to-one mapping between queues and threads is possible, n-to-one mapping is also possible, for any n greater than 1.
  • In the example of FIG. 5, groups 452 and 454 both reserve two separate queues for each thread. This illustrates the principle that web server 420 and email server 430 may be relatively more important to the function of server 300 than support function 450 and operating system 322. Thus, it may be advantageous in certain embodiments to allocate additional queues for each thread, to avoid bottlenecks in the communication. Thus, in this example, queues 0 and 1 are reserved for web server thread 420-0. Queues 2 and 3 are for web server thread 420-1. Queues 4 and 5 are for web server thread 420-2. Queues 6 and 7 are for web server thread 420-3.
  • Similarly, queues 8 and 9 are for email server thread 430-0. Queues 10 and 11 are for email server thread 430-1.
  • Finally, queues 12 through 15 are allocated for general purposes, and may be used for support function 450 and operating system 322, including all traffic not on port 80 or 25.
  • Note that the examples of FIG. 4 and FIG. 5 are nonlimiting examples of potential configurations. Other configurations are possible, including a configuration in which the number of queues exactly matches the number of cores, or where the number of queues is related to the number of cores by a non-integer function.
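  • The queue layouts of FIG. 4 and FIG. 5 differ only in how many queues are reserved per thread. As a minimal sketch, assuming queues are handed out consecutively in the order the applications are listed, the hypothetical helper `allocate_queues` below reproduces both layouts. The function name and its parameters are illustrative assumptions, not part of the disclosed apparatus.

```python
def allocate_queues(threads_per_app, queues_per_thread, total_queues):
    """Reserve consecutive queues per thread; leftover queues are general purpose."""
    mapping = {}
    next_queue = 0
    for app, n_threads in threads_per_app.items():
        for thread in range(n_threads):
            end = next_queue + queues_per_thread
            mapping[(app, thread)] = list(range(next_queue, end))
            next_queue = end
    if next_queue > total_queues:
        raise ValueError("not enough queues for the requested reservation")
    general = list(range(next_queue, total_queues))
    return mapping, general


# FIG. 5 style layout: two queues per thread, 16 queues in total.
fig5_mapping, fig5_general = allocate_queues({"web": 4, "email": 2}, 2, 16)
```

  • With `queues_per_thread` set to 1, the same call yields the one-to-one layout of FIG. 4, in which queues 6 through 15 remain for general use.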
  • FIG. 6 is a block diagram illustration of a traditional OSI network stack model. In the traditional OSI model, the network stack includes seven layers, specifically physical layer 610, data link layer 620, network layer 630, transport layer 640, session layer 650, presentation layer 660, and application layer 670. The OSI model is presented here for purposes of illustration, but it should be understood that the OSI model is a nonlimiting example only. The “internet protocol suite” (often referred to as TCP/IP) is also commonly used. TCP/IP layers may be similar but not identical to OSI layers.
  • The seven layers of the OSI model are described briefly here for the purpose of context for relevant portions of the present description. The layers are, in summary:
      • a. Physical Layer 610—The physical layer (layer 1) provides networking hardware and transmission technologies. This layer includes physical wires, clocks, voltages, and other physical structures to provide the transmission of raw bits, rather than logical data packets. Each bit stream may be grouped into code words or symbols and converted to a physical signal that is transmitted over a hardware transmission medium. The physical layer provides an electrical, mechanical, and procedural interface to the transmission medium. Within the context of the seven-layer model, the physical layer translates logical communications requests from the data link layer into hardware-specific operations to effect transmission or reception of electronic signals.
      • b. Data Link Layer 620—The data link layer (layer 2) is the protocol layer that transfers data between adjacent network nodes to provide functional and procedural transfer of data between network entities. In certain embodiments, the data link layer also provides error detection and correction. The basic data unit at this layer is a data link frame, which is not intended to cross a local network boundary. Routing between networks and global addressing are performed at higher layers. This allows the data link layer to focus on local delivery, addressing, and media arbitration. Data-link protocols specify how devices detect and recover from collisions, and may provide mechanisms to reduce or prevent them. Thus, in certain existing architectures, queuing of packets is based on using hashes of values or absolute values available at this layer, such as a MAC address.
      • c. Network Layer 630—Layer 3 is the network layer. The network layer is responsible for packet forwarding, including routing through intermediate routers. The network layer provides the functional and procedural means of transferring variable-length data sequences from a source to a destination host via one or more networks, while maintaining the quality of service functions. Functions of the network layer include:
        • i. Connection model: connectionless communication. For example, IP is connectionless, in that a datagram can travel from a sender to a recipient without the recipient having to send an acknowledgement. Connection-oriented protocols exist at other, higher layers of the OSI model.
        • ii. Host addressing. Every host in the network must have a unique address that determines where it is. This address is normally assigned from a hierarchical system. In TCP/IP, this is an IP address, either 32-bit IPv4 (e.g., 192.168.1.1), or 128-bit IPv6 (e.g., 2001:0DB8:AC10:FE01:0000:0000:0000:0000). An IPv4 address may be either globally unique, or locally unique (i.e., if it is used in a local subnetwork). An IPv6 address is intended to be globally unique.
        • iii. Message forwarding. Since many networks are partitioned into subnetworks and connect to other networks for wide-area communications, networks use specialized hosts, called gateways or routers, to forward packets between networks. This is also of interest to mobile applications, where a user may move from one location to another, and it must be arranged that his messages follow him.
        • iv. The network layer responds to service requests from the transport layer and issues service requests to the data link layer.
      • d. Transport Layer 640—Layer 4 is the transport layer, which provides end-to-end or host-to-host communication services for applications. The transport layer provides services such as connection-oriented data stream support, reliability, flow control, and multiplexing. Simple connectionless (stateless) implementations may use user datagram protocol (UDP) at Layer 4, while connection-oriented (stateful) implementations may use transmission control protocol (TCP). Datagram Congestion Control Protocol (DCCP) and Stream Control Transmission Protocol (SCTP) are also provided by way of nonlimiting example.
      • e. Session Layer 650—Layer 5 is the session layer. The session layer opens, closes, and manages sessions between end-user applications, such as end-user application 162 or client application 112 of FIG. 1 and a web server 420 of FIG. 4. Communication sessions consist of requests and responses that occur between applications. Session-layer services are commonly used in application environments that make use of remote procedure calls (RPCs). The session layer may respond to service requests from the presentation layer and issue service requests to the transport layer.
      • f. Presentation Layer 660—Layer 6 is the presentation layer, also called the syntax layer. The presentation layer is responsible for the delivery and formatting of information to the application layer for further processing or display, including serialization of data structures into flat byte strings. This layer handles syntactical differences in data representation within the end-user systems. For example, EBCDIC-coded data may be converted to ASCII for use with an application. This layer also handles issues of diverse data types. For example, Pascal-style text strings, with an explicit integer length, may need to be interpreted differently from C-style null-terminated strings. Where appropriate, encryption may also be performed at this layer.
      • g. Application Layer 670—Layer 7 is the application layer. In the OSI model, layer 7 is responsible for displaying received information to the user.
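  • Because the destination port is carried in the transport-layer header, a classifier that keys on the port has to step past the data link and network headers to reach it. The following Python sketch illustrates this for the simple case of an untagged Ethernet II frame carrying IPv4 with TCP or UDP. The function name `destination_port` is an assumption introduced for illustration, and VLAN tags, IPv6, and fragmentation are deliberately not handled.

```python
import struct

ETH_HEADER_LEN = 14      # destination MAC, source MAC, EtherType
ETHERTYPE_IPV4 = 0x0800
PROTO_TCP = 6
PROTO_UDP = 17


def destination_port(frame):
    """Return the layer 4 destination port of an Ethernet II/IPv4 frame, or None."""
    if len(frame) < ETH_HEADER_LEN + 20:
        return None
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype != ETHERTYPE_IPV4:
        return None
    version_ihl = frame[ETH_HEADER_LEN]
    if version_ihl >> 4 != 4:
        return None
    ip_header_len = (version_ihl & 0x0F) * 4  # IHL is counted in 32-bit words
    if ip_header_len < 20:
        return None
    protocol = frame[ETH_HEADER_LEN + 9]
    if protocol not in (PROTO_TCP, PROTO_UDP):
        return None
    l4_offset = ETH_HEADER_LEN + ip_header_len
    if len(frame) < l4_offset + 4:
        return None
    # TCP and UDP both begin with source port, then destination port.
    _src_port, dst_port = struct.unpack_from("!HH", frame, l4_offset)
    return dst_port
```

  • A frame for which this sketch returns None would, under the scheme described herein, fall through to whatever default queue assignment applies.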
  • In certain existing networking architectures, incoming traffic to a network interface such as NIC 360 is assigned to queues such as queues 450.
  • FIG. 7 is a flowchart of a method 700 performed by queuing engine 326, particularly for received packets, according to one or more examples of the present specification.
  • In block 720, NIC 360 receives an incoming packet. This means that the physical interface receives electrical signals providing an incoming data packet.
  • In block 730, queuing engine 326 parses the packet in a top-down manner. Specifically, in this example, queuing engine 326 extracts the destination port for the packet, though other values may be used.
  • In block 740, queuing engine 326 determines whether the port is configured as a “listening” port. If so, it identifies a number of queues in the group allocated for this port.
  • In block 750, queuing engine 326 load balances or distributes the packet to its assigned queue. For example, if group 452 has been designated for HTTP traffic, and if the incoming packet is on port 80, then queuing engine 326 may assign the packet to one of the queues in group 452 according to an appropriate load-balancing algorithm. RSS, Flow Director, or similar may be used to identify the specific queue to assign the packet into. The packet may then be made available via direct memory access (DMA), so that it can be accessed independently of the processor where appropriate.
  • In block 760, if interrupts are enabled (e.g., rather than “busy polling,”) and the interrupt throttling timer has expired, queuing engine 326 sends an interrupt to the host. In other words, the interrupt driver does not trigger an interrupt with every incoming packet. Rather, interrupts are triggered only when a threshold of packets or time has been reached. An interrupt service routine may then signal server engine 324 directly. Interrupts are asserted depending on the triggering logic, and if asserted signal the application. In either case, polling from server engine 324 may trigger the packet flow.
  • In block 780, queuing engine 326 sends a queuing notification to the appropriate process, such as a thread of server engine 324. For example, in the case of web server 420, queuing engine 326 may send a notification to the appropriate thread such as thread 0 420-0 of the process. Specifically, the driver may pick the packet from the queue and pass along with it opaque information that identifies the specific queue. This information is stored in the sockets layer and is available to the application.
  • In block 799, the method is done.
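  • The interrupt throttling of block 760 can be sketched as a small state machine that asserts an interrupt only once a packet-count threshold or a time threshold has been reached. The Python class below is a simplified illustration. The name `InterruptThrottle` and its parameters are assumptions, and unlike a hardware timer, this sketch evaluates the time threshold only when a packet arrives.

```python
class InterruptThrottle:
    """Assert an interrupt only after a packet-count or elapsed-time threshold."""

    def __init__(self, max_packets, max_delay):
        self.max_packets = max_packets  # packets to accumulate before interrupting
        self.max_delay = max_delay      # seconds the oldest pending packet may wait
        self.pending = 0
        self.first_pending_at = None

    def on_packet(self, now):
        """Record one queued packet at time `now`; return True if an interrupt fires."""
        if self.pending == 0:
            self.first_pending_at = now
        self.pending += 1
        waited = now - self.first_pending_at
        if self.pending >= self.max_packets or waited >= self.max_delay:
            # Threshold reached: signal the host and start a new batch.
            self.pending = 0
            self.first_pending_at = None
            return True
        return False
```

  • For instance, with `max_packets` of 3 the first two packets of a burst are queued silently and the third triggers the interrupt, so the host is signaled once per batch rather than once per packet.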
  • FIG. 8 is a flowchart of a method 800 performed by an application, such as server engine 324, according to one or more examples of the present specification.
  • In block 810, the application thread makes an asynchronous poll call. For example, the Linux operating system provides a call known as EPOLL. If using EPOLL or similar, the application gets the opaque queue identifier for the socket it is watching and checks whether it matches the queue identifier of the other sockets being watched on a particular event file descriptor. If so, it adds this socket to that watch list. Otherwise, it hands the socket to the event file descriptor that has the matching queue identifier, and the socket is added to that descriptor's watch list. In an example, a single thread operates on a single event file descriptor. Thus, matching watched sockets that share a queue identifier streamlines the traffic flow without interference from other applications.
  • In block 820, the application thread polls its assigned queues according to the socket/event file descriptors provided by asynchronous polling. If the thread was asleep because there is no data (and consequently, no work for it to do), a wakeup may be performed directly on the sleeping thread, allowing the thread to come back and pull packets from its queues. In other words, protocol and descriptor processing occur in the context of the thread, not in the context of the interrupt, or deferred procedure call (DPC) (Windows)/SoftIRQ (Linux), which is scheduled as a result of an interrupt. In doing so, scheduling affinities are preserved. The scheduler wakeup of the application (such as server engine 324) keeps all shared kernel data structures associated with packet delivery and transmission in the same caching/scheduling domain, aligning with the scheduling policies of an operating system/hypervisor.
  • In block 830, the application thread performs busy polling until there is a timeout. Busy polling sockets may be streamlined to pull packets directly from the application-specific queues.
  • In decision block 840, if data are available, then in block 870, method 900 of FIG. 9 is performed.
  • If data are not available, then in block 880, the thread goes to sleep with interrupts enabled.
  • In block 899, the method is done.
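  • The socket grouping of block 810 and the busy polling of blocks 830 through 880 can be sketched without reference to any particular operating system interface. In the Python sketch below, `build_watch_lists` groups sockets by queue identifier so that each event file descriptor (and therefore each thread) watches sockets fed by one queue, and `busy_poll` spins on a queue until data arrives or a timeout expires. Both function names are illustrative assumptions, and a plain `deque` stands in for an application-specific NIC queue.

```python
import time
from collections import defaultdict, deque


def build_watch_lists(socket_queue_ids):
    """Group sockets by queue identifier: one watch list per event file descriptor."""
    watch_lists = defaultdict(list)
    for sock, queue_id in socket_queue_ids.items():
        watch_lists[queue_id].append(sock)
    return dict(watch_lists)


def busy_poll(queue, timeout, clock=time.monotonic):
    """Spin on `queue` until a packet is available or `timeout` seconds elapse."""
    deadline = clock() + timeout
    while True:
        if queue:
            return queue.popleft()  # data available: hand it to the consumer
        if clock() >= deadline:
            return None  # caller would now sleep with interrupts enabled
```

  • A thread in this sketch would call `busy_poll` on each queue in its watch list and go to sleep only when every call returns None.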
  • FIG. 9 is a flowchart of a method 900 disclosing additional details of data consumption according to one or more examples of the present specification.
  • In block 940, NIC 360 stamps a queue identifier token associated with a packet, and propagates the identifier up the network stack. Application 324 then receives the queue identifier from NIC 360. Note that if there is more than one NIC 360, this may also include a device identifier (for example, NIC 0, queue 1; NIC 1, queue 1; etc).
  • In block 960, the socket layer stores the queue identifier as part of its state. Again, the queue identifier may be device specific.
  • In block 980, application 324 consumes the data.
  • In block 999, the method is done.
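  • As a minimal sketch of blocks 940 through 980, the queue identifier can be modeled as a (device, queue) pair that travels with the packet and is recorded in the socket's state on delivery. The names `QueueId`, `SocketState`, and `stamp` are hypothetical and introduced only for illustration.

```python
from typing import NamedTuple, Optional


class QueueId(NamedTuple):
    """Device-specific queue identifier, e.g. NIC 0, queue 1."""
    device: int
    queue: int


def stamp(device, queue, payload):
    """Block 940: associate a queue identifier token with a received packet."""
    return QueueId(device, queue), payload


class SocketState:
    """Socket-layer state that remembers which queue last fed this socket."""

    def __init__(self):
        self.queue_id: Optional[QueueId] = None
        self.received = []

    def deliver(self, queue_id, payload):
        """Block 960: store the identifier, then hold the data for the application."""
        self.queue_id = queue_id
        self.received.append(payload)
```

  • Including the device in the identifier keeps queue 1 of NIC 0 distinct from queue 1 of NIC 1, as noted above for systems with more than one NIC 360.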
  • FIG. 10 is a flow chart of a method 1000 of performing a transmit operation with ALNQ according to one or more examples of the present specification. This method is understood as a nonlimiting example, and in particular, specific network layers are assigned operations in this flow chart by way of illustration.
  • In block 1010, application 324 may set queue mapping for the socket, such as via extended socket options. Note that this may be necessary only if the send-to queue is different from the last transmit operation.
  • In block 1030, application 324 sends the packet using the normal socket SEND call.
  • In block 1040, the socket layer of the network stack appends the queue identifier to the sent packet.
  • In block 1050, the protocol and network layers perform their normal, known functions.
  • In block 1060, the device layer checks the queue identifier and places the packet on a specified queue for transmission.
  • In block 1080, the device layer signals the appropriate NIC 360 that the packet is ready for transmission.
  • In block 1090, NIC 360 transmits the packet.
  • In block 1099, the method is done.
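  • The transmit flow of method 1000 can be sketched as follows, assuming a single NIC whose transmit queues are modeled as in-memory deques. The classes `TxSocket` and `DeviceLayer` and their methods are hypothetical stand-ins for the socket layer and device layer, and `set_queue_mapping` stands in for the extended socket option mentioned in block 1010.

```python
from collections import deque


class DeviceLayer:
    """Holds per-queue transmit buffers for one NIC (blocks 1060 and 1080)."""

    def __init__(self, num_queues):
        self.tx_queues = [deque() for _ in range(num_queues)]

    def place(self, tagged_packet):
        """Check the queue identifier and place the packet on that queue."""
        queue_id, payload = tagged_packet
        self.tx_queues[queue_id].append(payload)


class TxSocket:
    """Socket-layer state carrying the queue mapping chosen by the application."""

    def __init__(self, device_layer):
        self.device_layer = device_layer
        self.queue_id = None

    def set_queue_mapping(self, queue_id):
        """Block 1010: only needed when the target queue changes."""
        self.queue_id = queue_id

    def send(self, payload):
        """Blocks 1030 and 1040: tag the packet with the stored queue identifier."""
        if self.queue_id is None:
            raise RuntimeError("no queue mapping set for this socket")
        self.device_layer.place((self.queue_id, payload))
```

  • Because the mapping is stored on the socket, consecutive sends to the same queue need no further configuration, which matches the note in block 1010.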
  • The foregoing outlines features of several embodiments so that those skilled in the art may better understand various aspects of the present disclosure. Those skilled in the art should appreciate that they may readily use the present disclosure as a basis for designing or modifying other processes and structures for carrying out the same purposes and/or achieving the same advantages of the embodiments introduced herein. Those skilled in the art should also realize that such equivalent constructions do not depart from the spirit and scope of the present disclosure, and that they may make various changes, substitutions, and alterations herein without departing from the spirit and scope of the present disclosure. Note that client device 200 and server 300 may both be examples of computing apparatuses or other computing devices.
  • All or part of any hardware element disclosed herein may readily be provided in a system-on-a-chip (SoC), including central processing unit (CPU) package. An SoC represents an integrated circuit (IC) that integrates components of a computer or other electronic system into a single chip. Thus, for example, client devices 110 or server devices 300 may be provided, in whole or in part, in an SoC. The SoC may contain digital, analog, mixed-signal, and radio frequency functions, all of which may be provided on a single chip substrate. Other embodiments may include a multi-chip-module (MCM), with a plurality of chips located within a single electronic package and configured to interact closely with each other through the electronic package. In various other embodiments, the computing functionalities disclosed herein may be implemented in one or more silicon cores in Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs), and other semiconductor chips.
  • Note also that in certain embodiments, some of the components may be omitted or consolidated. In a general sense, the arrangements depicted in the figures may be more logical in their representations, whereas a physical architecture may include various permutations, combinations, and/or hybrids of these elements. It is imperative to note that countless possible design configurations can be used to achieve the operational objectives outlined herein. Accordingly, the associated infrastructure has a myriad of substitute arrangements, design choices, device possibilities, hardware configurations, software implementations, and equipment options.
  • In a general sense, any suitably-configured processor, such as processor 310, can execute any type of instructions associated with the data to achieve the operations detailed herein. Any processor disclosed herein could transform an element or an article (for example, data) from one state or thing to another state or thing. In another example, some activities outlined herein may be implemented with fixed logic or programmable logic (for example, software and/or computer instructions executed by a processor) and the elements identified herein could be some type of a programmable processor, programmable digital logic (for example, a field programmable gate array (FPGA), an erasable programmable read only memory (EPROM), an electrically erasable programmable read only memory (EEPROM)), an ASIC that includes digital logic, software, code, electronic instructions, flash memory, optical disks, CD-ROMs, DVD ROMs, magnetic or optical cards, other types of machine-readable mediums suitable for storing electronic instructions, or any suitable combination thereof.
  • In operation, a storage such as storage 350 may store information in any suitable type of tangible, non-transitory storage medium (for example, random access memory (RAM), read only memory (ROM), field programmable gate array (FPGA), erasable programmable read only memory (EPROM), electrically erasable programmable ROM (EEPROM), etc.), software, hardware (for example, processor instructions or microcode), or in any other suitable component, device, element, or object where appropriate and based on particular needs. Furthermore, the information being tracked, sent, received, or stored in a processor could be provided in any database, register, table, cache, queue, control list, or storage structure, based on particular needs and implementations, all of which could be referenced in any suitable timeframe. Any of the memory or storage elements disclosed herein, such as memory 320 and storage 350, should be construed as being encompassed within the broad terms ‘memory’ and ‘storage,’ as appropriate. A non-transitory storage medium herein is expressly intended to include any non-transitory special-purpose or programmable hardware configured to provide the disclosed operations, or to cause a processor such as processor 310 to perform the disclosed operations.
  • Computer program logic implementing all or part of the functionality described herein is embodied in various forms, including, but in no way limited to, a source code form, a computer executable form, machine instructions or microcode, programmable hardware, and various intermediate forms (for example, forms generated by an assembler, compiler, linker, or locator). In an example, source code includes a series of computer program instructions implemented in various programming languages, such as an object code, an assembly language, or a high-level language such as OpenCL, Fortran, C, C++, JAVA, or HTML for use with various operating systems or operating environments, or in hardware description languages such as Spice, Verilog, and VHDL. The source code may define and use various data structures and communication messages. The source code may be in a computer executable form (e.g., via an interpreter), or the source code may be converted (e.g., via a translator, assembler, or compiler) into a computer executable form, or converted to an intermediate form such as byte code. Where appropriate, any of the foregoing may be used to build or describe appropriate discrete or integrated circuits, whether sequential, combinatorial, state machines, or otherwise.
  • In one example embodiment, any number of electrical circuits of the FIGURES may be implemented on a board of an associated electronic device. The board can be a general circuit board that can hold various components of the internal electronic system of the electronic device and, further, provide connectors for other peripherals. More specifically, the board can provide the electrical connections by which the other components of the system can communicate electrically. Any suitable processor and memory can be suitably coupled to the board based on particular configuration needs, processing demands, and computing designs. Other components such as external storage, additional sensors, controllers for audio/video display, and peripheral devices may be attached to the board as plug-in cards, via cables, or integrated into the board itself. In another example, the electrical circuits of the FIGURES may be implemented as stand-alone modules (e.g., a device with associated components and circuitry configured to perform a specific application or function) or implemented as plug-in modules into application specific hardware of electronic devices.
  • Note that with the numerous examples provided herein, interaction may be described in terms of two, three, four, or more electrical components. However, this has been done for purposes of clarity and example only. It should be appreciated that the system can be consolidated or reconfigured in any suitable manner. Along similar design alternatives, any of the illustrated components, modules, and elements of the FIGURES may be combined in various possible configurations, all of which are within the broad scope of this specification. In certain cases, it may be easier to describe one or more of the functionalities of a given set of flows by only referencing a limited number of electrical elements. It should be appreciated that the electrical circuits of the FIGURES and its teachings are readily scalable and can accommodate a large number of components, as well as more complicated/sophisticated arrangements and configurations. Accordingly, the examples provided should not limit the scope or inhibit the broad teachings of the electrical circuits as potentially applied to a myriad of other architectures.
  • Numerous other changes, substitutions, variations, alterations, and modifications may be ascertained to one skilled in the art and it is intended that the present disclosure encompass all such changes, substitutions, variations, alterations, and modifications as falling within the scope of the appended claims. In order to assist the United States Patent and Trademark Office (USPTO) and, additionally, any readers of any patent issued on this application in interpreting the claims appended hereto, Applicant wishes to note that the Applicant: (a) does not intend any of the appended claims to invoke paragraph six (6) of 35 U.S.C. section 112 (pre-AIA) or paragraph (f) of the same section (post-AIA), as it exists on the date of the filing hereof unless the words “means for” or “steps for” are specifically used in the particular claims; and (b) does not intend, by any statement in the specification, to limit this disclosure in any way that is not otherwise expressly reflected in the appended claims.
  • Example Implementations
  • There is disclosed in one example a network interface card (NIC), comprising: an ingress interface to receive incoming traffic; a plurality of queues to queue incoming traffic; an egress interface to direct incoming traffic to a plurality of server applications; and a queuing engine, including logic to: uniquely associate a queue with a selected server application; receive an incoming network packet; determine that the selected server application may process the incoming network packet; and assign the incoming network packet to the queue.
  • There is further disclosed an example NIC, wherein determining that the selected server application may process the incoming network packet comprises inspecting an application identifier of the incoming network packet.
  • There is further disclosed an example NIC, wherein the application identifier comprises a port number.
  • There is further disclosed an example NIC, further comprising means for notifying the selected server application of the assigning.
  • There is further disclosed an example NIC, further comprising means to enable the selected server application to bypass an interrupt mechanism and to perform per-thread, non-interfering busy polling.
  • There is further disclosed an example NIC, wherein the queuing engine is further to receive a queue mapping for a socket via an extended socket option.
  • There is further disclosed an example NIC, wherein the queueing engine further includes logic to uniquely associate a plurality of queues with the selected server application.
  • There is further disclosed an example NIC, wherein the queueing engine further includes logic to uniquely assign a second queue to a second selected server application.
  • There is further disclosed an example NIC, wherein the queueing engine further includes logic to assign a support queue to handle incoming packets not directed to a queue uniquely assigned to a server application.
  • There is further disclosed an example NIC, wherein the queueing engine further includes logic to assign a plurality of support queues.
  • There is further disclosed an example NIC, wherein the plurality of support queues number less than a sum of queues uniquely assigned to one or more selected server applications.
  • There is also disclosed in one example a tangible, non-transitory storage medium having stored thereon instructions to instruct a network interface card (NIC) to: deterministically associate an ingress queue of the NIC with a server function to be serviced by the NIC; receive an incoming network packet on an ingress interface of the NIC; determine that the server function may process the incoming network packet; and assign the incoming network packet to the queue.
  • There is further disclosed an example of a tangible, non-transitory storage medium, wherein determining that the selected server application may process the incoming network packet comprises inspecting an application identifier of the incoming network packet.
  • There is further disclosed an example of a tangible, non-transitory storage medium, wherein the application identifier comprises a port number.
  • There is further disclosed an example of a tangible, non-transitory storage medium, wherein the queueing engine further includes logic to uniquely associate a plurality of queues with the selected server application.
  • There is further disclosed an example of a tangible, non-transitory storage medium, wherein the queueing engine further includes logic to uniquely assign a second queue to a second selected server application.
  • There is further disclosed an example of a tangible, non-transitory storage medium, wherein the queueing engine further includes logic to assign a support queue to handle incoming packets not directed to a queue uniquely assigned to a server application.
  • There is also disclosed an example method of providing deterministic per-application packet queueing, comprising: uniquely associating one or more ingress queues of a network interface card (NIC) with a server function to be serviced by the NIC; receiving an incoming network packet on an ingress interface of the NIC; determining that the network packet is directed to the associated server function; and assigning the incoming network packet to a queue selected from the one or more uniquely associated queues.
  • There is further disclosed an example method, wherein determining that the network packet is directed to the associated server function comprises inspecting an application identifier of the incoming network packet.
  • There is further disclosed an example method, wherein the application identifier comprises a port number.

Claims (20)

What is claimed is:
1. A network interface card (NIC), comprising:
an ingress interface to receive incoming traffic;
a plurality of queues to queue incoming traffic;
an egress interface to direct incoming traffic to a plurality of server applications; and
a queuing engine, including logic to:
uniquely associate a queue with a selected server application;
receive an incoming network packet;
determine that the selected server application may process the incoming network packet; and
assign the incoming network packet to the queue.
2. The NIC of claim 1, wherein determining that the selected server application may process the incoming network packet comprises inspecting an application identifier of the incoming network packet.
3. The NIC of claim 2, wherein the application identifier comprises a port number.
4. The NIC of claim 1, further comprising means for notifying the selected server application of the assigning.
5. The NIC of claim 1, further comprising means to enable the selected server application to bypass an interrupt mechanism and to perform per-thread, non-interfering busy polling.
6. The NIC of claim 1, wherein the queuing engine is further to receive a queue mapping for a socket via an extended socket option.
7. The NIC of claim 1, wherein the queueing engine further includes logic to uniquely associate a plurality of queues with the selected server application.
8. The NIC of claim 1, wherein the queueing engine further includes logic to uniquely assign a second queue to a second selected server application.
9. The NIC of claim 1, wherein the queueing engine further includes logic to assign a support queue to handle incoming packets not directed to a queue uniquely assigned to a server application.
10. The NIC of claim 9, wherein the queueing engine further includes logic to assign a plurality of support queues.
11. The NIC of claim 10, wherein the plurality of support queues number less than a sum of queues uniquely assigned to one or more selected server applications.
12. A tangible, non-transitory storage medium having stored thereon instructions to instruct a network interface card (NIC) to:
deterministically associate an ingress queue of the NIC with a server function to be serviced by the NIC;
receive an incoming network packet on an ingress interface of the NIC;
determine that the server function may process the incoming network packet; and
assign the incoming network packet to the queue.
13. The tangible, non-transitory storage medium of claim 12, wherein determining that the selected server application may process the incoming network packet comprises inspecting an application identifier of the incoming network packet.
14. The tangible, non-transitory storage medium of claim 13, wherein the application identifier comprises a port number.
15. The tangible, non-transitory storage medium of claim 12, wherein the queueing engine further includes logic to uniquely associate a plurality of queues with the selected server application.
16. The tangible, non-transitory storage medium of claim 12, wherein the queueing engine further includes logic to uniquely assign a second queue to a second selected server application.
17. The tangible, non-transitory storage medium of claim 12, wherein the queueing engine further includes logic to assign a support queue to handle incoming packets not directed to a queue uniquely assigned to a server application.
18. A method of providing deterministic per-application packet queueing, comprising:
uniquely associating one or more ingress queues of a network interface card (NIC) with a server function to be serviced by the NIC;
receiving an incoming network packet on an ingress interface of the NIC;
determining that the network packet is directed to the associated server function; and
assigning the incoming network packet to a queue selected from the one or more uniquely associated queues.
19. The method of claim 18, wherein determining that the network packet is directed to the associated server function comprises inspecting an application identifier of the incoming network packet.
20. The method of claim 19, wherein the application identifier comprises a port number.
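The claims leave open how a queue is selected when several queues are uniquely associated with one application, and how the application consumes them. A minimal sketch, assuming selection by a hash of the sender's address and port and assuming one polling thread per queue (both are illustrative choices; `select_queue` and `busy_poll` are hypothetical names):

```python
import threading
import zlib
from collections import deque


def select_queue(queues, src_ip, src_port):
    """Pick one of the application's own queues by hashing the flow."""
    key = f"{src_ip}:{src_port}".encode()
    return queues[zlib.crc32(key) % len(queues)]


def busy_poll(queue, handled, stop):
    """Poll only this thread's queue; no interrupt or wakeup is involved."""
    while not stop.is_set() or queue:
        try:
            handled.append(queue.popleft())
        except IndexError:
            pass  # queue was empty: spin and try again


# Two queues uniquely associated with a single application.
app_queues = [deque(), deque()]
handled = [[], []]            # one result list per polling thread
stop = threading.Event()

workers = [
    threading.Thread(target=busy_poll, args=(q, h, stop))
    for q, h in zip(app_queues, handled)
]
for worker in workers:
    worker.start()

# Eight incoming flows from one sender, distinguished by source port.
for port in range(1000, 1008):
    select_queue(app_queues, "10.0.0.1", port).append(("10.0.0.1", port))

stop.set()
for worker in workers:
    worker.join()
```

Each thread touches only its own queue, so the pollers do not contend with one another, and packets of a given flow always reach the same thread.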
US16/773,801 2015-12-26 2020-01-27 Application-level network queueing Abandoned US20200162396A1 (en)

Priority Applications (3)

Application Number Publication Number Priority Date Filing Date Title
US16/773,801 US20200162396A1 (en) 2015-12-26 2020-01-27 Application-level network queueing
US17/566,381 US11706151B2 (en) 2015-12-26 2021-12-30 Application-level network queueing
US18/201,068 US20230300078A1 (en) 2015-12-26 2023-05-23 Application-level network queueing

Applications Claiming Priority (2)

Application Number Publication Number Priority Date Filing Date Title
US14/998,138 US10547559B2 (en) 2015-12-26 2015-12-26 Application-level network queueing
US16/773,801 US20200162396A1 (en) 2015-12-26 2020-01-27 Application-level network queueing

Related Parent Applications (1)

Application Number Relation Publication Number Priority Date Filing Date Title
US14/998,138 Continuation US10547559B2 (en) 2015-12-26 2015-12-26 Application-level network queueing

Related Child Applications (1)

Application Number Relation Publication Number Priority Date Filing Date Title
US17/566,381 Continuation US11706151B2 (en) 2015-12-26 2021-12-30 Application-level network queueing

Publications (1)

Publication Number Publication Date
US20200162396A1 (en) 2020-05-21

Family

ID=59086736

Family Applications (4)

Application Number Status Publication Number Priority Date Filing Date Title
US14/998,138 Active US10547559B2 (en) 2015-12-26 2015-12-26 Application-level network queueing
US16/773,801 Abandoned US20200162396A1 (en) 2015-12-26 2020-01-27 Application-level network queueing
US17/566,381 Active 2036-01-01 US11706151B2 (en) 2015-12-26 2021-12-30 Application-level network queueing
US18/201,068 Pending US20230300078A1 (en) 2015-12-26 2023-05-23 Application-level network queueing

Country Status (5)

Country Link
US (4) US10547559B2 (en)
EP (1) EP3395022A4 (en)
JP (1) JP6513627B2 (en)
CN (3) CN114615212A (en)
WO (1) WO2017112326A1 (en)

Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10547559B2 (en) 2015-12-26 2020-01-28 Intel Corporation Application-level network queueing
EP3437256A4 (en) 2016-03-25 2019-10-30 Nebbiolo Technologies, Inc. Fog computing facilitated flexible factory
US10798063B2 (en) 2016-10-21 2020-10-06 Nebbiolo Technologies, Inc. Enterprise grade security for integrating multiple domains with a public cloud
US10191911B2 (en) * 2017-05-27 2019-01-29 Plesk International Gmbh Permanent website hosting on mobile devices
US10747632B2 (en) * 2017-08-11 2020-08-18 T-Mobile Usa, Inc. Data redundancy and allocation system
US10972579B2 (en) * 2017-10-13 2021-04-06 Nebbiolo Technologies, Inc. Adaptive scheduling for edge devices and networks
US10735341B2 (en) 2018-04-26 2020-08-04 Nicira, Inc. Dynamic provisioning of multiple RSS engines
US10686716B2 (en) 2018-07-23 2020-06-16 Vmware, Inc. Dynamic processing of packets using multiple receive queue features
US11025546B2 (en) 2018-07-25 2021-06-01 Vmware, Inc. Selection of paired transmit queue
US11366753B2 (en) * 2018-07-31 2022-06-21 Marvell Asia Pte Ltd Controlling performance of a solid state drive
US11343193B2 (en) * 2020-01-03 2022-05-24 Realtek Singapore Private Limited Apparatus and method for rate management and bandwidth control
US20200241927A1 (en) * 2020-04-15 2020-07-30 Intel Corporation Storage transactions with predictable latency
CN113965561B (en) * 2021-10-20 2023-08-25 中电科航空电子有限公司 Asynchronous event driven-based airborne file transmission system
JP2023082442A (en) 2021-12-02 2023-06-14 キオクシア株式会社 Information processing system and memory system
CN115396386B (en) * 2022-08-09 2023-11-17 伟志股份公司 Data sharing system, method and application thereof

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070067324A1 (en) * 2005-09-19 2007-03-22 International Business Machines Corporation Real Time Work Queue Notification
US20070130367A1 (en) * 2005-11-18 2007-06-07 Vivek Kashyap Inbound connection prioritization
US20110142064A1 (en) * 2009-12-15 2011-06-16 Dubal Scott P Dynamic receive queue balancing
US20130077486A1 (en) * 2011-06-27 2013-03-28 Seth Keith Prioritizing classes of network traffic to provide a predetermined quality of service
US20130204965A1 (en) * 2012-02-03 2013-08-08 Cahya Masputra Packet transmission on a client using implicit enabling of features based on service classifications
US20140351206A1 (en) * 2013-05-21 2014-11-27 Samsung Electronics Co., Ltd. Method and apparatus for application management
US20150334008A1 (en) * 2014-05-15 2015-11-19 Samsung Electronics Co., Ltd. Method of distributing data and device supporting the same
US20160378545A1 (en) * 2015-05-10 2016-12-29 Apl Software Inc. Methods and architecture for enhanced computer performance

Family Cites Families (51)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5682553A (en) * 1995-04-14 1997-10-28 Mitsubishi Electric Information Technology Center America, Inc. Host computer and network interface using a two-dimensional per-application list of application level free buffers
US6256660B1 (en) * 1997-04-08 2001-07-03 International Business Machines Corporation Method and program product for allowing application programs to avoid unnecessary packet arrival interrupts
US6295532B1 (en) * 1999-03-02 2001-09-25 Nms Communications Corporation Apparatus and method for classifying information received by a communications system
AU2001276000A1 (en) * 2000-08-31 2002-03-13 Netrake Corporation Method for enforcing service level agreements
US9525696B2 (en) * 2000-09-25 2016-12-20 Blue Coat Systems, Inc. Systems and methods for processing data flows
JP4403348B2 (en) 2000-12-14 2010-01-27 ソニー株式会社 Communication apparatus and communication method
US20050002405A1 (en) * 2001-10-29 2005-01-06 Hanzhong Gao Method system and data structure for multimedia communications
CN1322752C (en) * 2001-12-26 2007-06-20 好利集团有限公司 Internet protocol interdynamic TV system and its method for implementing it
US7702739B1 (en) 2002-10-01 2010-04-20 Bao Tran Efficient transactional messaging between loosely coupled client and server over multiple intermittent networks with policy based routing
US8347313B2 (en) * 2003-05-21 2013-01-01 Resilient Networks, Inc. Method and apparatus for automating organization of processes
US7703138B2 (en) * 2004-12-29 2010-04-20 Intel Corporation Use of application signature to identify trusted traffic
US7606147B2 (en) * 2005-04-13 2009-10-20 Zeugma Systems Inc. Application aware traffic shaping service node positioned between the access and core networks
US8645558B2 (en) * 2005-06-15 2014-02-04 Solarflare Communications, Inc. Reception according to a data transfer protocol of data directed to any of a plurality of destination entities for data extraction
US8737579B2 (en) * 2005-08-03 2014-05-27 At&T Intellectual Property I, L.P. Methods of routing messages using a listener registry
US20070067770A1 (en) * 2005-09-16 2007-03-22 Thomasson Christopher M System and method for reduced overhead in multithreaded programs
WO2007138250A2 (en) * 2006-05-25 2007-12-06 Solarflare Communications Incorporated Computer system with lock- protected queues for sending and receiving data
US20070280105A1 (en) * 2006-05-31 2007-12-06 Omri Barkay Enabling client QoS middle layer based on application recognition
EP2552080B1 (en) * 2006-07-10 2017-05-10 Solarflare Communications Inc Chimney onload implementation of network protocol stack
JP2008020977A (en) * 2006-07-11 2008-01-31 Sony Computer Entertainment Inc Network processor system and network protocol processing method
JP4872952B2 (en) * 2008-03-06 2012-02-08 日本電気株式会社 TCP buffer copy distributed parallel processing apparatus, method and program
US8619563B2 (en) * 2009-02-03 2013-12-31 Qualcomm Incorporated Method and apparatus for interference management in a wireless communication system
US8856783B2 (en) * 2010-10-12 2014-10-07 Citrix Systems, Inc. Allocating virtual machines according to user-specific virtual machine metrics
US8238324B2 (en) * 2009-07-24 2012-08-07 Broadcom Corporation Method and system for network aware virtual machines
US8462797B2 (en) * 2009-11-30 2013-06-11 Alcatel Lucent Method of priority based transmission of wireless video
CN102088470B (en) * 2009-12-07 2014-03-05 中兴通讯股份有限公司 Method and device for deadlock detection and self-recovery of file transfer protocol (FTP) in communication system
WO2011096307A1 (en) * 2010-02-03 2011-08-11 日本電気株式会社 Proxy device and operation method thereof
US8527993B2 (en) * 2010-06-01 2013-09-03 Qualcomm Incorporated Tasking system interface methods and apparatuses for use in wireless devices
US8547847B2 (en) * 2010-07-22 2013-10-01 International Business Machines Corporation Transparent quality of service
US8990380B2 (en) * 2010-08-12 2015-03-24 Citrix Systems, Inc. Systems and methods for quality of service of ICA published applications
US8638795B2 (en) * 2010-08-12 2014-01-28 Citrix Systems, Inc. Systems and methods for quality of service of encrypted network traffic
US8433783B2 (en) 2010-09-29 2013-04-30 Citrix Systems, Inc. Systems and methods for providing quality of service via a flow controlled tunnel
WO2012132263A1 (en) 2011-03-28 2012-10-04 パナソニック株式会社 Repeater, method for controlling repeater, and program
US8705363B2 (en) * 2011-04-05 2014-04-22 Telefonaktiebolaget L M Ericsson (Publ) Packet scheduling method and apparatus
US8924501B2 (en) * 2011-11-30 2014-12-30 Red Hat Israel, Ltd. Application-driven shared device queue polling
US8990901B2 (en) 2012-05-05 2015-03-24 Citrix Systems, Inc. Systems and methods for network filtering in VPN
WO2014002500A1 (en) 2012-06-29 2014-01-03 Nec Corporation Qos control system, qos control method, and program
US9513950B2 (en) * 2012-07-25 2016-12-06 Vmware, Inc. Dynamic resource configuration based on context
US9094460B2 (en) * 2012-09-06 2015-07-28 Unisys Corporation Socket tables for fast data packet transfer operations
US9762939B2 (en) * 2012-09-28 2017-09-12 Marvell World Trade Ltd. Enhanced user experience for miracast devices
US8868800B2 (en) 2013-03-12 2014-10-21 Empire Technology Development Llc Accelerator buffer access
US9450881B2 (en) * 2013-07-09 2016-09-20 Intel Corporation Method and system for traffic metering to limit a received packet rate
US9558132B2 (en) 2013-08-14 2017-01-31 Intel Corporation Socket management with reduced latency packet processing
US9571426B2 (en) 2013-08-26 2017-02-14 Vmware, Inc. Traffic and load aware dynamic queue management
US9241044B2 (en) * 2013-08-28 2016-01-19 Hola Networks, Ltd. System and method for improving internet communication by using intermediate nodes
US9378047B1 (en) * 2013-09-18 2016-06-28 Emc Corporation Efficient communication of interrupts from kernel space to user space using event queues
US9619286B1 (en) * 2014-03-20 2017-04-11 EMC IP Holding Company LLC Using request service time as fairness heuristic in locking techniques
US9778963B2 (en) * 2014-03-31 2017-10-03 Solarflare Communications, Inc. Ordered event notification
WO2015199366A1 (en) * 2014-06-26 2015-12-30 정기웅 Method for scheduling in multiprocessing environment and device therefor
US9930133B2 (en) * 2014-10-23 2018-03-27 Netapp, Inc. System and method for managing application performance
US9921991B2 (en) * 2015-11-24 2018-03-20 Nicira, Inc. Systems and methods for flipping NIC teaming configuration without interfering live traffic
US10547559B2 (en) 2015-12-26 2020-01-28 Intel Corporation Application-level network queueing

Also Published As

Publication number Publication date
US20230300078A1 (en) 2023-09-21
US20220124047A1 (en) 2022-04-21
US10547559B2 (en) 2020-01-28
JP2017117448A (en) 2017-06-29
EP3395022A4 (en) 2019-06-19
CN106921590A (en) 2017-07-04
WO2017112326A1 (en) 2017-06-29
CN114745341A (en) 2022-07-12
CN114615212A (en) 2022-06-10
EP3395022A1 (en) 2018-10-31
US20170187640A1 (en) 2017-06-29
US11706151B2 (en) 2023-07-18
JP6513627B2 (en) 2019-05-15

Legal Events

Date Code Title Description
STPP Information on status: patent application and granting procedure in general. Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION
STPP Information on status: patent application and granting procedure in general. Free format text: NON FINAL ACTION MAILED
STPP Information on status: patent application and granting procedure in general. Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER
STPP Information on status: patent application and granting procedure in general. Free format text: FINAL REJECTION MAILED
STPP Information on status: patent application and granting procedure in general. Free format text: RESPONSE AFTER FINAL ACTION FORWARDED TO EXAMINER
STPP Information on status: patent application and granting procedure in general. Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS
STPP Information on status: patent application and granting procedure in general. Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION
STPP Information on status: patent application and granting procedure in general. Free format text: NON FINAL ACTION MAILED
STPP Information on status: patent application and granting procedure in general. Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER
STPP Information on status: patent application and granting procedure in general. Free format text: FINAL REJECTION MAILED
STCB Information on status: application discontinuation. Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION