ISIA – Data Transmission (2)

High-capacity data transmission

E.g: high-resolution images, sound, video

  • Internet
    • wired networks
    • Wi-Fi access points: local-area mobility
    • ‘free’ routing across entire Internet
  • Packet-switched cellular radio network
    • wide-area mobility
    • 2G: General Packet Radio Service (GPRS) 171 kbit/s (theoretical maximum)
    • 4G: WiMAX (802.16e) 34 Mbit/s – 1 Gbit/s
    • ‘free’ bridge to Internet

Low-capacity data transmission

E.g: weather information, slow-moving GPS location, intermittent conditions (flood, fire)

  • Global System for Mobile communications (GSM) Short Message Service (SMS)
    • text data transmission
      • ‘rich’ binary content requires encoding
    • wide-area mobility
    • machine-to-machine only (one call originator, one call receiver)
  • Ideal ‘fail-over’ service method during interruptions to primary method
  • Can be used to ‘wake up’ sleeping low-power remote station
    • e.g., to establish high-capacity (high-power) network connection

Communication protocols

  • A system of rules governing the exchange of information between two communicators
    • agreed upon by all parties involved
  • Specific messages sent
  • Specific actions taken
    • specific responses sent when messages received
  • A protocol defines
    • syntax — how the transmitted data is physically arranged
      • fields for meta-data (describing and controlling the communication)
      • payload (the information being exchanged)
    • semantics — what the different parts of the transmitted data mean
    • behavior — error recovery mechanisms
  • Protocol may be implemented by hardware, software, or a combination of both
  • Example: ‘8E2’ RS-232 serial communication
    • meta-data: header = start bit
    • payload: 8 bits of data
    • meta-data: trailer = even parity bit, two stop bits
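The 8E2 frame layout above can be sketched in a few lines of Python. This is only an illustration: the function name frame_8e2 is hypothetical, and the sketch assumes the usual UART conventions (start bit is 0, stop bits are 1, data bits are sent least significant bit first).

```python
def frame_8e2(byte):
    """Return the bits sent on the line for one byte using 8E2 framing."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("payload must fit in 8 bits")
    data = [(byte >> i) & 1 for i in range(8)]  # payload, least significant bit first
    parity = sum(data) % 2                      # even parity: total count of 1 bits becomes even
    return [0] + data + [parity] + [1, 1]       # start bit, payload, parity bit, two stop bits

bits = frame_8e2(ord("A"))                      # 'A' is 0x41
print(bits, len(bits))
```

Each 8-bit payload therefore costs 12 bits on the line: one third of the transmitted bits are meta-data.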

Communication protocol challenges

  • Deal with content or size restrictions; e.g., for SMS:
    • maximum single-segment message size is 160 7-bit characters (GSM default alphabet, similar to ASCII)
    • maximum multiple-segment message size is 153 7-bit characters per segment
      • maximum 35 segments per message = 5355 character limit
      • limits may depend on service provider
      • multiple-segment messages are charged per segment
  • Route the data to the correct recipient
    • destination computer (which might be running many communication programs),
      identified using a human-friendly name, and
    • destination program (or background ‘daemon’ process)
  • Provide reliable, secure delivery
    • data arrives intact and in the correct order
    • data cannot be intercepted by ‘man-in-the-middle’ attacker
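As a rough illustration of the SMS size restrictions listed above, the number of segments needed for a message can be computed as follows. The function name sms_segments is hypothetical, the limits are the ones quoted above (they may differ between service providers), and the sketch assumes every character of the text occupies exactly one 7-bit character.

```python
import math

SINGLE_LIMIT = 160    # characters in a single-segment message
SEGMENT_LIMIT = 153   # characters per segment of a multiple-segment message
MAX_SEGMENTS = 35     # limit quoted above; may depend on the service provider

def sms_segments(text):
    """Return how many segments (and therefore charges) the text requires."""
    n = len(text)
    if n <= SINGLE_LIMIT:
        return 1
    segments = math.ceil(n / SEGMENT_LIMIT)
    if segments > MAX_SEGMENTS:
        raise ValueError("message too long: %d characters" % n)
    return segments

print(sms_segments("x" * 160))   # fits in one segment
print(sms_segments("x" * 161))   # one character too many for a single segment
```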

Network protocol design

  • Two critical ideas
    • data encapsulation
    • the end-to-end principle
  • Encapsulation: logically separate network functions are hidden from each other
    • the network is organized as a set of layers, each providing a single function
    • a given layer may be included or omitted, or have multiple possible functions
    • lower layers do not interpret the data from higher layers
  • The end-to-end principle
    “only the application knows what communication services the application requires”
    • the network should not provide (nor add any overhead whatsoever for)
      services to applications that don’t need them
      • reliable delivery, encryption, content encoding, compression, etc.
        (pick any combination)

Encapsulation

  • OSI seven-layer model of networks: each layer
    • provides a service to the layer(s) above it
  • Each layer
    • adds its own meta-data (header, trailer) to the message
    • without modifying the data from the layer above

End-to-end principle

  • Each layer
    • communicates only with its corresponding layer on the remote machine
    • implements the minimum mechanism required for the service it provides

Addresses: MAC

  • Link layer
    • every connected device has a medium access control (MAC) address
    • local scope; must be unique on the LAN
    • rarely seen by end users
    • frames are sent to MAC addresses

Addresses: IP

  • Network layer
    • every remotely-accessible machine has an Internet Protocol (IP) address
    • global scope; must be unique in the entire Internet
    • Internet packets are sent to IP addresses

Addresses: port

  • Transport layer
    • every protocol must have software that can handle it, identified by port number
    • local scope; must be unique on a single machine (identifies a server process)
    • connections are made to specific ports on a remote machine
      • most protocols have a standard default port
      • e.g., hypertext transfer protocol (HTTP) Web server is usually on port 80

Uniform resource identifiers

  • Common way to write addresses in a human-readable form
    scheme:[//[user[:password]@]host[:port]]
    (where ‘[x]’ indicates an optional component x)
  • scheme indicates the protocol (and implies a default port)
    • if one protocol encapsulates another, multiple schemes are separated by + signs
    • e.g., svn+ssh indicates the svn protocol (port 3690) will be used encapsulated within ssh (port 22)
  • user, password specify authority (login) credentials for the connection
  • host specifies an IP address, symbolically or numerically
  • port specifies which process (server) is to be contacted
    • default value depends on the scheme (defaults listed in the file /etc/services)
  • The above parts of the URI are interpreted at the sending side
    • selects which protocol and port to use, and which remote machine to contact
    • may also be used to select which application to run; e.g., in a Mac OS X terminal, try:
      open http://www.google.com (opens your default web browser)
      open ftp://speedtest.tele2.net (opens a Finder window)
      open vnc://localhost (tries (in vain) to open screen sharing)

Uniform resource identifiers

  • A request for a resource or action optionally follows the address
    scheme:[//[user[:password]@]host[:port]][/path][?query][#fragment]
  • path names some hierarchical resource or action desired of the server
    • must begin with a slash ‘/’ (but cannot begin with two slashes ‘//’)
    • looks like a POSIX (BSD, MacOS, Linux) path (file) name
      • additional ‘/’ characters used to specify a resource hierarchically, e.g:
        http://some.machine.net/blog/uploads/2018/05/02/isa04.pdf
  • query permits adding key-value parameters to the request separated by &
    • path?key1=value1&key2=value2
      e.g: http://www.google.com/search?q=tcp%2Fip
  • fragment permits specifying a section or index within the resource
    • web browsers use the fragment to scroll the page to the desired location
    • note: in that case the fragment is interpreted by the client, not the server
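The components described above can be picked apart with Python's standard urllib.parse module. The URI below is invented for illustration (the user name, password, port and fragment are assumptions, not taken from a real server).

```python
from urllib.parse import urlsplit, parse_qs

# hypothetical URI exercising every component
uri = "http://alice:secret@some.machine.net:8080/blog/uploads?key1=value1&key2=value2#section2"
parts = urlsplit(uri)

print(parts.scheme)                        # protocol
print(parts.username, parts.password)      # login credentials
print(parts.hostname, parts.port)          # machine and server process
print(parts.path)                          # hierarchical resource
print(parse_qs(parts.query))               # key-value parameters
print(parts.fragment)                      # interpreted by the client

# when no port is written, the client falls back to the scheme's default
search = urlsplit("http://www.google.com/search?q=tcp%2Fip")
print(search.port, parse_qs(search.query))
```

Note that parse_qs also undoes the percent encoding of the query values.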

Communication protocol solutions

  • Route the data to the correct recipient
    • network layer forwards packets according to their destination IP address
    • if the IP address matches the local machine, deliver the payload
    • otherwise send it one hop closer to its destination
      • to the next (directly-connected) gateway or router
  • Provide reliable, secure delivery
    • transport layer adds sequence and acknowledgement numbers
    • detects out-of-order, duplicated, or dropped packets
    • requests retransmission when necessary
  • Deal with content or size restrictions; e.g., for SMS or e-mail:
    • presentation layer encodes binary content using safe characters

Presentation layer: content encoding

  • Problem
    • need to send binary data or international text (e.g., UTF-8)
    • application only supports plain text (e.g., e-mail, SMS)
  • Solution
    • encode content using only the application’s supported character set
  • Two commonly-used schemes
    • ‘percent’ or ‘URL’ (hexadecimal) encoding
    • base64 encoding

URL (percent, hexadecimal, base16) encoding

  • Unsafe characters encoded as a three-character sequence
    • a ‘%’ character, indicating the start of an encoded character
    • the most significant four bits of the character, as a hexadecimal digit
    • the least significant four bits of the character, as a hexadecimal digit
  • Example: web page URLs must be 7-bit ASCII, and use ‘/’ as a separator
    • replace literal ‘/’, and other unsafe characters, in URLs with ‘%XX’ (hex code)
    • e.g: perform a Google search for ‘tcp/ip’ (where 2F is the hex code for ‘/’)
      http://www.google.com/search?q=tcp%2Fip
    • also works for encoding UTF-8 characters in URLs
  • “hello, world” in URL encoding:
    %68%65%6C%6C%6F%2C%20%77%6F%72%6C%64
  • Efficiency
    • the output is a factor of 3 larger than the input (when every character is encoded)
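Percent encoding can be tried out with Python's standard urllib.parse module. The helper percent_encode_all is a hypothetical name introduced here to reproduce the fully encoded “hello, world” string; the standard quote function only encodes the characters that are actually unsafe.

```python
from urllib.parse import quote, unquote

def percent_encode_all(text):
    """Encode every byte of the UTF-8 text as %XX, whether it is unsafe or not."""
    return "".join("%{:02X}".format(b) for b in text.encode("utf-8"))

print(quote("tcp/ip", safe=""))              # only the '/' needs encoding
print(unquote("tcp%2Fip"))                   # decoding restores the original text
print(percent_encode_all("hello, world"))    # three output characters per input byte
```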

Base64 encoding

  • Instead of 4 bits at a time, encode 6 bits at a time
    • requires 64 different ‘digits’ for values 0 . . . 63
    • ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/
      (these are available in almost all existing computer character sets)
  • To encode, consider the entire message as a single sequence of bits
    • encode the first six bits as a single ‘digit’ and print it
    • remove the first six bits from the message
    • repeat until fewer than 6 bits remain
    • if more than 0 bits remain
      • add as many 0 bits to the right until 6 bits are available
      • encode as a single digit and print it
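  • “hello, world” followed by a newline character, in base64 encoding: aGVsbG8sIHdvcmxkCg
    • without the newline the encoding is aGVsbG8sIHdvcmxk
    • standard base64 also pads the output with ‘=’ to a multiple of 4 characters (aGVsbG8sIHdvcmxkCg==)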
  • Efficiency
    • six input bits are encoded as 8 output bits (one character)
    • the output is therefore a factor of 1.333 larger than the input
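The example string and the size ratio can be checked against Python's standard base64 module. One assumption is made here: the encoded string shown above corresponds to the text followed by a newline character, with the trailing ‘=’ padding that standard base64 appends left out.

```python
import base64

message = b"hello, world\n"
padded = base64.b64encode(message).decode("ascii")   # standard base64, with '=' padding
unpadded = padded.rstrip("=")                        # as printed by a non-padding encoder
print(padded, unpadded)

# every 3 input bytes (24 bits) become 4 output characters
big = bytes(3000)
print(len(base64.b64encode(big)) / len(big))
```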

base64 encoder (C version)

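#include <stdio.h>

/* the 64 'digits' used to represent the values 0...63 */
const char table[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

unsigned out = 0, pos = 0;	/* bits collected so far, and how many of them */

/* append the 8 bits of byte, most significant first; print one digit for every 6 bits */
void shiftout(unsigned byte)
{
	for (int i = 0; i < 8; ++i) {
		out <<= 1;
		out |= ((byte >> 7) & 1);
		byte <<= 1;
		pos += 1;
		if (pos == 6) {
			putchar(table[out]);
			out = 0;
			pos = 0;
		}
	}
}

/* pad any remaining bits with zeros on the right and print the final digit */
void finish(void)
{
	if (!pos) return;
	out <<= (6 - pos);
	putchar(table[out]);
	out = 0;
	pos = 0;
}

int main(int argc, char **argv) {
	int c;
	while (EOF != (c = getchar())) shiftout(c);
	finish();
	return 0;
}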

base64 encoder (Python version)

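import sys

table = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

out = 0	# bits collected so far
pos = 0	# number of bits collected in out

def shiftout(bits):
	"""Append the 8 bits of one byte, most significant first."""
	global out, pos
	for i in range(8):
		out <<= 1
		out |= (bits >> 7) & 1
		bits <<= 1
		pos += 1
		if pos == 6:	# six bits collected: print one digit
			sys.stdout.write(table[out])
			out = 0
			pos = 0

def finish():
	"""Pad any remaining bits with zeros on the right and print the final digit."""
	global out, pos
	if pos:
		out <<= (6 - pos)
		sys.stdout.write(table[out])
		out = 0
		pos = 0

# read raw bytes (Python 3), not text, so binary and non-ASCII input is encoded correctly
for byte in sys.stdin.buffer.read():
	shiftout(byte)

finish()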

base64 decoding

  • perform the opposite of encoding
    • read a character and find its index in the encoding table
    • append the six bits of its index to the output
    • when 8 bits have been appended, output those 8 bits as a character
    • repeat until no more input remains
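The decoding steps above can be sketched in Python as follows. The function name decode is hypothetical; as a convenience not described above, the sketch skips ‘=’ padding and whitespace wherever they occur, and it discards any bits left over at the end (these are the zero bits added by the encoder).

```python
TABLE = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

def decode(text):
    """Decode base64 text (padded or not) into bytes."""
    out = bytearray()
    acc = 0      # bits collected so far
    nbits = 0    # number of bits held in acc
    for ch in text:
        if ch == "=" or ch.isspace():        # ignore padding and line breaks
            continue
        acc = (acc << 6) | TABLE.index(ch)   # append the six bits of the digit's index
        nbits += 6
        if nbits >= 8:                       # a whole byte is available: output it
            nbits -= 8
            out.append((acc >> nbits) & 0xFF)
            acc &= (1 << nbits) - 1          # keep only the bits not yet output
    return bytes(out)

print(decode("aGVsbG8sIHdvcmxk"))
```

A character that is not in the table raises ValueError (from TABLE.index), which is a reasonable reaction to corrupted input.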

Summary

  application    the end-user’s application that produces and consumes information,
                 e.g., web server and web browser, or remote sensor and data logger
  presentation   translation of end-user content to/from a network-friendly format,
                 e.g., binary files (images, audio) to base64 encoding for the
                 (ASCII-only) e-mail system
  session        communication handshake, authentication, shutdown
  transport      reliable and/or secure transmission of messages
  network        routing of packets through the network to remote hosts (WAN)
  data link      delivery of packets to directly-connected devices (LAN)
  physical       transmission and reception of bits over a physical medium

(mnemonic: all people seem to need data processing)

  • Each layer provides a communication service to the layer above it
  • Each layer exchanges meta-data and payload with its corresponding layer on the remote machine
  • Each layer uses the layer below it for primitive communication services
  • Presentation layer
    • encodes and decodes content
    • security (encryption), compatibility (binary content)
  • Transport layer
    • provides process-to-process (client-to-server) communication for applications
    • can add reliable delivery, security, etc., to network layer service
  • Network layer
    • provides unreliable host-to-host delivery of packets of data
    • handles routing of packets to their final destination
  • Important things we did not have time for
    • the domain name system (DNS)
      • how symbolic names are translated to IP addresses
    • routing
      • how the network layer knows to where a packet should be forwarded
    • multi-cast and broadcast
      • how to send data to many clients, without overloading the network
  • However, to use the network it is much more important to know
    • how the network layers operate with each other
    • how encapsulation works
    • how the end-to-end principle is applied
  • And
    • how to implement your own service, e.g., in the presentation layer