Date: Thu, 21 Nov 96 10:19:00 PST
From: Vineet Kumar
To: gary_shirk@ccm.jf.intel.com, iana@isi.edu
Subject: need payload type assigned to G.723.1 audio codec

G.723.1 is an audio codec used in ITU's H.323 and H.324. It is widely
used for Internet telephony. Intel's own Internet Phone product has
over 300,000 downloads. Microsoft's NetMeeting also implements H.323
with G.723.1 and is expected to reach millions of consumers. Given
this demand, it makes sense to have a static payload type assigned to
G.723.1.

--------------------------------------------------------------------

About G.723.1
---------------

G.723.1 is specified in ITU Recommendation G.723.1. Reference
implementations for G.723.1 are available as part of the CCITT/ITU-T
Software Tool Library (STL) from the ITU General Secretariat, Sales
Service, Place des Nations, CH-1211 Geneve 20, Switzerland. The
library is covered by a license.

The Recommendation specifies a coded representation that can be used
for compressing the speech or other audio signal component of
multimedia services at a very low bit rate.

A G.723.1 frame can be one of three sizes: 24 bytes (6.3 kb/s frame),
20 bytes (5.3 kb/s frame), or 4 bytes. The 4-byte frames are called
SID (Silence Insertion Descriptor) frames and are used to specify
comfort noise parameters. There is no restriction on how 4-, 20-, and
24-byte frames are intermixed. The first two bits of each frame
identify its size and therefore the frame boundary.

It is possible to switch between the two rates at any 30 ms frame
boundary. Both rates (5.3 kb/s and 6.3 kb/s) are a mandatory part of
the encoder and decoder. The coder was optimized to represent speech
with high quality at these rates using a limited amount of
complexity.

Conformance to RFC 1890
------------------------

G.723.1 packetization conforms to RFC 1890 except for the
packetization interval (30 ms vs. the 20 ms default):

1. The first packet of a talkspurt (the first packet after a silence
   period) is distinguished by setting the marker bit in the RTP data
   header.

2. The sampling frequency (RTP clock frequency) is 8000 Hz.

3. The packetization interval should have a duration of 30 ms (one
   frame), as opposed to the default packetization interval of 20 ms.

4. Codecs should be able to encode and decode several consecutive
   frames within a single packet.

5. A receiver should accept packets representing between 0 and 180 ms
   of audio data, as opposed to the default of 0 to 200 ms.

Bibliography
-------------

1. International Telecommunication Union (ITU-T), "Recommendation
   G.723.1: Dual Rate Speech Coder for Multimedia Communications
   transmitting at 5.3 and 6.3 kbits/s", Geneva, Switzerland,
   March 1996. (http://www.itu.ch).

2. H. Schulzrinne, "RTP Profile for Audio and Video Conferences with
   Minimal Control", RFC 1890, GMD Fokus, January 1996.

Author's Address
-------------------

Vineet Kumar
Intel Corporation, JF3-212
2111 NE 25th Avenue
Hillsboro, OR 97124-5961
USA

Phone: +1 (503) 264-3439
EMail: vineet_kumar@ccm.jf.intel.com
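
The frame rules under "About G.723.1" and the timing rules under
"Conformance to RFC 1890" can be sketched in Python as follows. This
is only an illustrative sketch, not a reference implementation. The
message says that the first two bits of a frame determine its
boundary but does not give the bit values. The mapping used here
(two least significant bits of the first octet: 00 = 24 bytes,
01 = 20 bytes, 10 = 4 bytes) is an assumption taken from later RTP
profile documents. The sketch also assumes that every frame,
including a SID frame, accounts for one 30 ms interval. The function
names are hypothetical.

```python
from typing import List

# ASSUMPTION: frame type is carried in the two least significant bits
# of the first octet; this mapping is not stated in the message above.
FRAME_SIZES = {0b00: 24,   # 6.3 kb/s speech frame
               0b01: 20,   # 5.3 kb/s speech frame
               0b10: 4}    # SID (comfort noise) frame

FRAME_MS = 30         # one frame covers 30 ms of audio
CLOCK_HZ = 8000       # RTP clock frequency from the message
MAX_PACKET_MS = 180   # receiver should accept 0 to 180 ms per packet


def split_frames(payload: bytes) -> List[bytes]:
    """Split an RTP payload into consecutive G.723.1 frames."""
    frames = []
    pos = 0
    while pos < len(payload):
        size = FRAME_SIZES.get(payload[pos] & 0b11)
        if size is None:
            raise ValueError("unknown frame type at offset %d" % pos)
        if pos + size > len(payload):
            raise ValueError("truncated frame at offset %d" % pos)
        frames.append(payload[pos:pos + size])
        pos += size
    return frames


def packet_duration_ms(payload: bytes) -> int:
    """Audio duration represented by the payload, in milliseconds."""
    return len(split_frames(payload)) * FRAME_MS


def timestamp_increment(payload: bytes) -> int:
    """RTP timestamp ticks covered by the payload at the 8000 Hz clock."""
    return packet_duration_ms(payload) * CLOCK_HZ // 1000


def within_receiver_limit(payload: bytes) -> bool:
    """True if the payload stays within the 180 ms receiver limit."""
    return packet_duration_ms(payload) <= MAX_PACKET_MS
```

For example, a payload holding one 24-byte frame, one 20-byte frame,
and one SID frame is 48 bytes long, represents 90 ms of audio, and
advances the RTP timestamp by 720 ticks (240 ticks per 30 ms frame).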