INTEL 80387 PROGRAMMER'S REFERENCE MANUAL 1987

Intel Corporation makes no warranty for the use of its products and assumes no responsibility for any errors which may appear in this document nor does it make a commitment to update the information contained herein.

Intel retains the right to make changes to these specifications at any time, without notice.

Contact your local sales office to obtain the latest specifications before placing your order.

The following are trademarks of Intel Corporation and may only be used to identify Intel Products:

Above, BITBUS, COMMputer, CREDIT, Data Pipeline, FASTPATH, Genius, i, î, ICE, iCEL, iCS, iDBP, iDIS, I²ICE, iLBX, im, iMDDX, iMMX, Inboard, Insite, Intel, intel, intelBOS, Intel Certified, Intelevision, inteligent Identifier, inteligent Programming, Intellec, Intellink, iOSP, iPDS, iPSC, iRMK, iRMX, iSBC, iSBX, iSDM, iSXM, KEPROM, Library Manager, MAPNET, MCS, Megachassis, MICROMAINFRAME, MULTIBUS, MULTICHANNEL, MULTIMODULE, MultiSERVER, ONCE, OpenNET, OTP, PC BUBBLE, Plug-A-Bubble, PROMPT, Promware, QUEST, QueX, Quick-Pulse Programming, Ripplemode, RMX/80, RUPI, Seamless, SLD, SugarCube, SupportNET, UPI, and VLSiCEL, and the combination of ICE, iCS, iRMX, iSBC, iSBX, iSXM, MCS, or UPI and a numerical suffix, 4-SITE.

MDS is an ordering code only and is not used as a product name or trademark. MDS(R) is a registered trademark of Mohawk Data Sciences Corporation.

*MULTIBUS is a patented Intel bus.

Unix is a trademark of AT&T Bell Labs. MS-DOS, XENIX, and Multiplan are trademarks of Microsoft Corporation. Lotus and 1-2-3 are registered trademarks of Lotus Development Corporation. SuperCalc is a registered trademark of Computer Associates International. Framework is a trademark of Ashton-Tate. System 370 is a trademark of IBM Corporation. AT is a registered trademark of IBM Corporation.
Additional copies of this manual or other Intel literature may be obtained from:

Intel Corporation
Literature Distribution
Mail Stop SC6-59
3065 Bowers Avenue
Santa Clara, CA 95051

(c)INTEL CORPORATION 1987 CG-5/26/87

Customer Support
---------------------------------------------------------------------------

Customer Support is Intel's complete support service that provides Intel customers with hardware support, software support, customer training, and consulting services. For more information, contact your local sales office.

After a customer purchases any system hardware or software product, service and support become major factors in determining whether that product will continue to meet a customer's expectations. Such support requires an international support organization and a breadth of programs to meet a variety of customer needs. As you might expect, Intel's customer support is quite extensive. It includes factory repair services and worldwide field service offices providing hardware repair services, software support services, customer training classes, and consulting services.

Hardware Support Services

Intel is committed to providing an international service support package through a wide variety of service offerings available from Intel Hardware Support.

Software Support Services

Intel's software support consists of two levels of contracts. Standard support includes TIPS (Technical Information Phone Service), updates, and the subscription service (product-specific troubleshooting guides and COMMENTS Magazine). Basic support includes updates and the subscription service. Contracts are sold in environments that represent product groupings (for example, the iRMX environment).

Consulting Services

Intel provides field systems engineering services for any phase of your development or support effort.
You can use our systems engineers in a variety of ways, ranging from assistance in using a new product, developing an application, personalizing training, and customizing or tailoring an Intel product to providing technical and management consulting. Systems Engineers are well versed in technical areas such as microcommunications, real-time applications, embedded microcontrollers, and network services. You know your application needs; we know our products. Working together we can help you get a successful product to market in the least possible time.

Customer Training

Intel offers a wide range of instructional programs covering various aspects of system design and implementation. In just three to ten days a limited number of individuals learn more in a single workshop than in weeks of self-study. For optimum convenience, workshops are scheduled regularly at Training Centers worldwide, or we can take our workshops to you for on-site instruction. Covering a wide variety of topics, Intel's major course categories include: architecture and assembly language, programming and operating systems, BITBUS and LAN applications.

Training Center Locations

To obtain a complete catalog of our workshops, call the nearest Training Center in your area.

Boston                 (617) 692-1000
Chicago                (312) 310-5700
San Francisco          (415) 940-7800
Washington D.C.        (301) 474-2878
Israel                 (972) 349-491-099
Tokyo                  03-437-6611
Osaka (Call Tokyo)     03-437-6611
Toronto, Canada        (416) 675-2105
London                 (0793) 696-000
Munich                 (089) 5389-1
Paris                  (01) 687-22-21
Stockholm              (468) 734-01-00
Milan                  39-2-82-44-071
Benelux (Rotterdam)    (10) 21-23-77
Copenhagen             (1) 198-033
Hong Kong              5-215311-7

Preface
---------------------------------------------------------------------------

This manual describes the 80387 Numeric Processor Extension (NPX) for the 80386 microprocessor. Understanding the 80387 requires an understanding of the 80386; therefore, a brief overview of 80386 concepts is presented first.
A detailed discussion of the 80386 microprocessor can be found in the 80386 Programmer's Reference Manual.

The 80386 Microsystem

The 80386 is the basis of a new VLSI microprocessor system with exceptional capabilities for supporting large-system applications. This powerful microsystem is designed to support multiuser reprogrammable and real-time multitasking applications. Its dedicated system support circuits simplify system hardware; sophisticated hardware and software tools reduce both the time and the cost of product development. The 80386 microsystem offers a total-solution approach, enabling you to develop high-speed, interactive, multiuser, multitasking (even multiprocessor) systems more rapidly and at higher performance than ever before.

• Reliability and system up-time are becoming increasingly important in all applications. Information must be protected from misuse or accidental loss. The 80386 includes a sophisticated and flexible four-level protection mechanism that can isolate layers of operating system programs from application programs to maintain a high degree of system integrity.

• The 80386 addresses up to 4 gigabytes of physical memory to support today's application requirements. This large physical memory enables the 80386 to keep many large programs and data structures simultaneously in memory for high-speed access.

• For applications with dynamically changing memory requirements, such as multiuser business systems, the 80386 CPU provides on-chip memory management and virtual memory support. On an 80386-based system, each user can have up to 64 terabytes of virtual-address space. This large address space virtually eliminates restrictions on the size of programs that may be part of the system. The memory management features are subject to control of systems software; therefore, systems software designers can choose among a variety of memory-organization models.
Systems designers can choose to view memory in terms of fixed-length pages, in terms of variable-length segments, or as a combination of pages and segments. The sizes of segments can range from one byte to 4 gigabytes. Virtual memory can be implemented either at the level of segments or at the level of pages.

• Large multiuser or real-time multitasking systems are easily supported by the 80386. High-performance features, such as a very high-speed task switch, fast interrupt-response time, intertask protection, page-oriented virtual memory, and a quick and direct operating system interface, make the 80386 highly suited to multiuser/multitasking applications.

• The 80386 has two primary operating modes: real-address mode and protected mode. In real-address mode, the 80386/80387 is fully upward compatible with the 8086, 8088, 80186, and 80188 microprocessors and with the 80286 real-address mode; all of the extensive libraries of 8086 and 8088 software execute 15 to 20 times faster on the 80386, without any modification.

• In protected mode, the advanced memory management and protection features of the 80386 become available, without any reduction in performance. Upgrading 8086 and 8088 application programs to use these new memory management and protection features usually requires only reassembly or recompilation (some programs may require minor modification). Entire 80286 protected-mode applications can run in this mode without modification.

• The virtual-8086 mode of the 80386 is available when the primary mode is protected mode. Virtual-8086 mode enables direct execution of multiple 8086/8088 programs within a protected-mode environment. Most 8086 and 8088 application programs can be executed in this environment without alteration (refer to the 80386 Programmer's Reference Manual for differences from the 8086).
This high degree of compatibility between the 80386 and earlier members of the 8086 processor family reduces both the time and the cost of software development.

The Organization of This Manual

This manual describes the 80387 Numeric Processor Extension (NPX) for the 80386 microprocessor. The material in this manual is presented from the perspective of software designers, at both the applications and the systems software level.

• Chapter 1, "Introduction to the 80387 Numerics Processor Extension," gives an overview of the 80387 NPX and reviews the concepts of numeric computation using the 80387.

• Chapter 2, "80387 Numerics Processor Architecture," presents the registers and data types of the 80387 to both applications and systems programmers.

• Chapter 3, "Special Computational Situations," discusses the special values that can be represented in the 80387's real formats (denormal numbers, zeros, infinities, and NaNs, that is, not-a-number values), as well as numerics exceptions. This chapter should be read thoroughly by systems programmers, but may be skimmed by applications programmers. Many of these special values and exceptions may never occur in applications programs.

• Chapter 4, "80387 Instruction Set," provides functional information for software designers generating applications for systems containing an 80386 CPU with an 80387 NPX. The 80386/80387 instruction set mnemonics are explained in detail.

• Chapter 5, "Programming Numeric Applications," provides a description of programming facilities for 80386/80387 systems. A comparative 80387 programming example is given.

• Chapter 6, "System-Level Numeric Programming," provides information of interest to systems software writers, including details of the 80387 architecture and operational characteristics.
• Chapter 7, "Numeric Programming Examples," provides several detailed programming examples for the 80387, including conditional branching, the conversion between floating-point values and their ASCII representations, and the use of trigonometric functions. These examples illustrate assembly-language programming on the 80387 NPX.

• Appendix A, "Machine Instruction Encoding and Decoding," gives reference information on the encoding of NPX instructions. This information is useful to writers of debuggers, exception handlers, and compilers.

• Appendix B, "Exception Summary," provides a list of the exceptions that each instruction can cause. This list is valuable to both applications and systems programmers.

• Appendix C, "Compatibility between the 80387 and the 80287/8087," describes the differences from the 80387 that are common to the 80287 and the 8087.

• Appendix D, "Compatibility between the 80387 and the 8087," describes the additional differences between the 80387 and the 8087 that are of concern when porting 8086/8087 programs directly to the 80386/80387.

• Appendix E, "80387 80-Bit CHMOS III Numeric Processor Extension," reproduces a data sheet of 80387 specifications that is separately available. The table of instruction timings in this appendix will be of interest to many readers of this manual. (The AC specifications have been deliberately left out.) The specifications in data sheets are subject to change; consult the most recent data sheet for design-in information.

• Appendix F, "PC/AT-Compatible 80387 Connection," documents a nonstandard method of connecting an 80387 to an 80386 to achieve compatibility with the IBM PC/AT.

• The Glossary defines 80387 and floating-point terminology. Refer to it as needed.

Related Publications

To best use the material in this manual, readers should be familiar with the operation and architecture of 80386 systems.
The following manuals contain information related to the content of this manual and of interest to programmers of 80387 systems:

• Introduction to the 80386, order number 231252
• 80386 Data Sheet, order number 231630
• 80386 Hardware Reference Manual, order number 231732
• 80386 Programmer's Reference Manual, order number 230985
• 80387 Data Sheet, order number 231920

Notational Conventions

This manual uses special notation to represent subscript and superscript characters. Subscript characters are surrounded by {curly brackets}, for example 10{2} = 10 base 2. Superscript characters are preceded by a caret and enclosed within (parentheses), for example 10^(3) = 10 to the third power.

Table of Contents
---------------------------------------------------------------------------

Chapter 1 Introduction to the 80387 Numerics Processor Extension
  1.1 History
  1.2 Performance
  1.3 Ease of Use
  1.4 Applications
  1.5 Upgradability
  1.6 Programming Interface

Chapter 2 80387 Numerics Processor Architecture
  2.1 80387 Registers
    2.1.1 The NPX Register Stack
    2.1.2 The NPX Status Word
    2.1.3 Control Word
    2.1.4 The NPX Tag Word
    2.1.5 The NPX Instruction and Data Pointers
  2.2 Computation Fundamentals
    2.2.1 Number System
    2.2.2 Data Types and Formats
      2.2.2.1 Binary Integers
      2.2.2.2 Decimal Integers
      2.2.2.3 Real Numbers
    2.2.3 Rounding Control
    2.2.4 Precision Control

Chapter 3 Special Computational Situations
  3.1 Special Numeric Values
    3.1.1 Denormal Real Numbers
      3.1.1.1 Denormals and Gradual Underflow
    3.1.2 Zeros
    3.1.3 Infinity
    3.1.4 NaN (Not-a-Number)
      3.1.4.1 Signaling NaNs
      3.1.4.2 Quiet NaNs
    3.1.5 Indefinite
    3.1.6 Encoding of Data Types
    3.1.7 Unsupported Formats
  3.2 Numeric Exceptions
    3.2.1 Handling Numeric Exceptions
      3.2.1.1 Automatic Exception Handling
      3.2.1.2 Software Exception Handling
    3.2.2 Invalid Operation
      3.2.2.1 Stack Exception
      3.2.2.2 Invalid Arithmetic Operation
    3.2.3 Division by Zero
    3.2.4 Denormal Operand
    3.2.5 Numeric Overflow and Underflow
      3.2.5.1 Overflow
      3.2.5.2 Underflow
    3.2.6 Inexact (Precision)
    3.2.7 Exception Priority
    3.2.8 Standard Underflow/Overflow Exception Handler

Chapter 4 The 80387 Instruction Set
  4.1 Compatibility with the 80287 and 8087
  4.2 Numeric Operands
  4.3 Data Transfer Instructions
    4.3.1 FLD source
    4.3.2 FST destination
    4.3.3 FSTP destination
    4.3.4 FXCH//destination
    4.3.5 FILD source
    4.3.6 FIST destination
    4.3.7 FISTP destination
    4.3.8 FBLD source
    4.3.9 FBSTP destination
  4.4 Nontranscendental Instructions
    4.4.1 Addition
    4.4.2 Normal Subtraction
    4.4.3 Reversed Subtraction
    4.4.4 Multiplication
    4.4.5 Normal Division
    4.4.6 Reversed Division
    4.4.7 FSQRT
    4.4.8 FSCALE
    4.4.9 FPREM: Partial Remainder (80287/8087-Compatible)
    4.4.10 FPREM1: Partial Remainder (IEEE Std. 754-Compatible)
    4.4.11 FRNDINT
    4.4.12 FXTRACT
    4.4.13 FABS
    4.4.14 FCHS
  4.5 Comparison Instructions
    4.5.1 FCOM//source
    4.5.2 FCOMP//source
    4.5.3 FCOMPP
    4.5.4 FICOM source
    4.5.5 FICOMP source
    4.5.6 FTST
    4.5.7 FUCOM//source
    4.5.8 FUCOMP//source
    4.5.9 FUCOMPP
    4.5.10 FXAM
  4.6 Transcendental Instructions
    4.6.1 FCOS
    4.6.2 FSIN
    4.6.3 FSINCOS
    4.6.4 FPTAN
    4.6.5 FPATAN
    4.6.6 F2XM1
    4.6.7 FYL2X
    4.6.8 FYL2XP1
  4.7 Constant Instructions
    4.7.1 FLDZ
    4.7.2 FLD1
    4.7.3 FLDPI
    4.7.4 FLDL2T
    4.7.5 FLDL2E
    4.7.6 FLDLG2
    4.7.7 FLDLN2
  4.8 Processor Control Instructions
    4.8.1 FINIT/FNINIT
    4.8.2 FLDCW source
    4.8.3 FSTCW/FNSTCW destination
    4.8.4 FSTSW/FNSTSW destination
    4.8.5 FSTSW AX/FNSTSW AX
    4.8.6 FCLEX/FNCLEX
    4.8.7 FSAVE/FNSAVE destination
    4.8.8 FRSTOR source
    4.8.9 FSTENV/FNSTENV destination
    4.8.10 FLDENV source
    4.8.11 FINCSTP
    4.8.12 FDECSTP
    4.8.13 FFREE destination
    4.8.14 FNOP
    4.8.15 FWAIT (CPU Instruction)

Chapter 5 Programming Numeric Applications
  5.1 Programming Facilities
    5.1.1 High-Level Languages
    5.1.2 C Programs
    5.1.3 PL/M-386
    5.1.4 ASM386
      5.1.4.1 Defining Data
      5.1.4.2 Records and Structures
      5.1.4.3 Addressing Methods
    5.1.5 Comparative Programming Example
    5.1.6 80387 Emulation
  5.2 Concurrent Processing with the 80387
    5.2.1 Managing Concurrency
      5.2.1.1 Incorrect Exception Synchronization
      5.2.1.2 Proper Exception Synchronization

Chapter 6 System-Level Numeric Programming
  6.1 80386/80387 Architecture
    6.1.1 Instruction and Operand Transfer
    6.1.2 Independent of CPU Addressing Modes
    6.1.3 Dedicated I/O Locations
  6.2 Processor Initialization and Control
    6.2.1 System Initialization
    6.2.2 Hardware Recognition of the NPX
    6.2.3 Software Recognition of the NPX
    6.2.4 Configuring the Numerics Environment
    6.2.5 Initializing the 80387
    6.2.6 80387 Emulation
    6.2.7 Handling Numerics Exceptions
    6.2.8 Simultaneous Exception Response
    6.2.9 Exception Recovery Examples

Chapter 7 Numeric Programming Examples
  7.1 Conditional Branching Example
  7.2 Exception Handling Examples
  7.3 Floating-Point to ASCII Conversion Examples
    7.3.1 Function Partitioning
    7.3.2 Exception Considerations
    7.3.3 Special Instructions
    7.3.4 Description of Operation
    7.3.5 Scaling the Value
      7.3.5.1 Inaccuracy in Scaling
      7.3.5.2 Avoiding Underflow and Overflow
      7.3.5.3 Final Adjustments
    7.3.6 Output Format
  7.4 Trigonometric Calculation Examples (Not Tested)

Appendix A Machine Instruction Encoding and Decoding
Appendix B Exception Summary
Appendix C Compatibility Between the 80387 and the 80287/8087
Appendix D Compatibility Between the 80387 and the 8087
Appendix E 80387 80-Bit CHMOS III Numeric Processor Extension
Appendix F PC/AT-Compatible 80387 Connection
Glossary of 80387 and Floating-Point Terminology

Figures
  1-1 Evolution and Performance of Numeric Processors
  2-1 80387 Register Set
  2-2 80387 Status Word
  2-3 80387 Control Word Format
  2-4 80387 Tag Word Format
  2-5 Protected Mode 80387 Instruction and Data Pointer Image in Memory, 32-Bit Format
  2-6 Real Mode 80387 Instruction and Data Pointer Image in Memory, 32-Bit Format
  2-7 Protected Mode 80387 Instruction and Data Pointer Image in Memory, 16-Bit Format
  2-8 Real Mode 80387 Instruction and Data Pointer Image in Memory, 16-Bit Format
  2-9 80387 Double-Precision Number System
  2-10 80387 Data Formats
  3-1 Floating-Point System with Denormals
  3-2 Floating-Point System without Denormals
  3-3 Arithmetic Example Using Infinity
  4-1 FSAVE/FRSTOR Memory Layout (32-Bit)
  4-2 FSAVE/FRSTOR Memory Layout (16-Bit)
  4-3 Protected Mode 80387 Environment, 32-Bit Format
  4-4 Real Mode 80387 Environment, 32-Bit Format
  4-5 Protected Mode 80387 Environment, 16-Bit Format
  4-6 Real Mode 80387 Environment, 16-Bit Format
  5-1 Sample C-386 Program
  5-2 Sample 80387 Constants
  5-3 Status Word Record Definition
  5-4 Structure Definition
  5-5 Sample PL/M-386 Program
  5-6 Sample ASM386 Program
  5-7 Instructions and Register Stack
  5-8 Exception Synchronization Examples
  6-1 Software Routine to Recognize the 80287
  7-1 Conditional Branching for Compares
  7-2 Conditional Branching for FXAM
  7-3 Full-State Exception Handler
  7-4 Reduced-Latency Exception Handler
  7-5 Reentrant Exception Handler
  7-6 Floating-Point to ASCII Conversion Routine
  7-7 Relationships between Adjacent Joints
  7-8 Robot Arm Kinematics Example

Tables
  1-1 Numeric Processing Speed Comparisons
  1-2 Numeric Data Types
  1-3 Principal NPX Instructions
  2-1 Condition Code Interpretation
  2-2 Correspondence between 80387 and 80386 Flag Bits
  2-3 Summary of Format Parameters
  2-4 Real Number Notation
  2-5 Rounding Modes
  3-1 Arithmetic and Nonarithmetic Instructions
  3-2 Denormalization Process
  3-3 Zero Operands and Results
  3-4 Infinity Operands and Results
  3-5 Rules for Generating QNaNs
  3-6 Binary Integer Encodings
  3-7 Packed Decimal Encodings
  3-8 Single and Double Real Encodings
  3-9 Extended Real Encodings
  3-10 Masked Responses to Invalid Operations
  3-11 Masked Overflow Results
  4-1 Data Transfer Instructions
  4-2 Nontranscendental Instructions
  4-3 Basic Nontranscendental Instructions and Operands
  4-4 Condition Code Interpretation after FPREM and FPREM1 Instructions
  4-5 Comparison Instructions
  4-6 Condition Code Resulting from Comparisons
  4-7 Condition Code Resulting from FTST
  4-8 Condition Code Defining Operand Class
  4-9 Transcendental Instructions
  4-10 Results of FPATAN
  4-11 Constant Instructions
  4-12 Processor Control Instructions
  5-1 PL/M-386 Built-In Procedures
  5-2 ASM386 Storage Allocation Directives
  5-3 Addressing Method Examples
  6-1 NPX Processor State Following Initialization

Chapter 1 Introduction to the 80387 Numerics Processor Extension
---------------------------------------------------------------------------

The 80387 NPX is a high-performance numerics processing element that extends the 80386 architecture by adding significant numeric capabilities and direct support for floating-point, extended-integer, and BCD data types. The 80386 CPU with 80387 NPX easily supports powerful and accurate numeric applications through its implementation of the IEEE Standard 754 for Binary Floating-Point Arithmetic. The 80387 provides floating-point performance comparable to that of large minicomputers while offering compatibility with object code for 8087 and 80287.

1.1 History

The 80387 Numeric Processor Extension (NPX) is compatible with its predecessors, the earlier Intel 8087 NPX and 80287 NPX. As the 80386 runs 8086 programs, so programs designed to use the 8087 and 80287 should run unchanged on the 80387.

The 8087 NPX was designed for use in 8086-family systems. The 8086 was the first microprocessor family to partition the processing unit to permit high-performance numeric capabilities. The 8087 NPX for this processor family implemented a complete numeric processing environment in compliance with an early proposal for the IEEE 754 Floating-Point Standard.

With the 80287 Numeric Processor Extension, high-speed numeric computations were extended to 80286 high-performance multitasking and multiuser systems. Multiple tasks using the numeric processor extension were afforded the full protection of the 80286 memory management and protection features.

The 80387 Numeric Processor Extension is Intel's third generation numerics processor.
The 80387 implements the final IEEE standard, adds new trigonometric instructions, and uses a new design and CHMOS-III process to allow higher clock rates and require fewer clocks per instruction. Together, the 80387 with additional instructions and the improved standard bring even more convenience and reliability to numerics programming and make this convenience and reliability available to applications that need the high-speed and large memory capacity of the 32-bit environment of the 80386 CPU.

Figure 1-1 illustrates the relative performance of 5-MHz 8086/8087, 8-MHz 80286/80287, and 20-MHz 80386/80387 systems in executing numerics-oriented applications.

Figure 1-1. Evolution and Performance of Numeric Processors
[Figure not reproduced. Vertical axis: relative performance (scale 1 to 16). Horizontal axis: year introduced (1980, 1983, 1987). Systems plotted: 8086/8087 (5 MHz), 80286/80287 (8 MHz), 80386/80387 (20 MHz).]

1.2 Performance

Table 1-1 compares the execution times of several 80387 instructions with the equivalent operations executed on an 8-MHz 80287. As indicated in the table, the 16-MHz 80387 NPX provides about 5 to 6 times the performance of an 8-MHz 80287 NPX. A 16-MHz 80387 multiplies 32-bit and 64-bit floating-point numbers in about 1.9 and 2.8 microseconds, respectively. Of course, the actual performance of the NPX in a given system depends on the characteristics of the individual application.

Although the performance figures shown in Table 1-1 refer to operations on real (floating-point) numbers, the 80387 also manipulates fixed-point binary and decimal integers of up to 64 bits or 18 digits, respectively. The 80387 can improve the speed of multiple-precision software algorithms for integer operations by 10 to 100 times.

Because the 80387 NPX is an extension of the 80386 CPU, no software overhead is incurred in setting up the NPX for computation. The 80387 and 80386 processors coordinate their activities in a manner transparent to software.
Moreover, built-in coordination facilities allow the 80386 CPU to proceed with other instructions while the 80387 NPX is simultaneously executing numeric instructions. Programs can exploit this concurrency of execution to further increase system performance and throughput.

Table 1-1. Numeric Processing Speed Comparisons

                                                    Approximate Performance Ratio:
                                                    16 MHz 80386/80387 ÷
Floating-Point Instruction                          8 MHz 80286/80287

FADD   ST, ST(i)                   Addition         6.2
FDIV   dword_var                   Division         4.7
FYL2X  stack (0), (1) assumed      Logarithm        6.0
FPATAN stack (0) assumed           Arctangent       2.6 (see note)
F2XM1  stack (0) assumed           Exponentiation   2.7 (see note)

Note: The ratio is higher if the operand is not in range of the 80287 instruction.

1.3 Ease of Use

The 80387 NPX offers more than raw execution speed for computation-intensive tasks. The 80387 brings the functionality and power of accurate numeric computation into the hands of the general user. These features are available in most high-level languages available for the 80386.

Like the 8087 and 80287 that preceded it, the 80387 is explicitly designed to deliver stable, accurate results when programmed using straightforward "pencil and paper" algorithms. The IEEE standard 754 specifically addresses this issue, recognizing the fundamental importance of making numeric computations both easy and safe to use.

For example, most computers can overflow when two single-precision floating-point numbers are multiplied together and then divided by a third, even if the final result is a perfectly valid 32-bit number. The 80387 delivers the correctly rounded result.
Other typical examples of undesirable machine behavior in straightforward calculations occur when computing financial rate of return, which involves the expression (1 + i)^(n), or when solving for roots of a quadratic equation:

$$\frac{-b \pm \sqrt{b^{2} - 4ac}}{2a}$$

If a does not equal 0, the formula is numerically unstable when the roots are nearly coincident or when their magnitudes are wildly different. The formula is also vulnerable to spurious over/underflows when the coefficients a, b, and c are all very big or all very tiny. When single-precision (4-byte) floating-point coefficients are given as data and the formula is evaluated in the 80387's normal way, keeping all intermediate results in its stack, the 80387 produces impeccable single-precision roots. This happens because, by default and with no effort on the programmer's part, the 80387 evaluates all those subexpressions with so much extra precision and range as to overwhelm any threat to numerical integrity.

If double-precision data and results were at issue, a better formula would have to be used, and once again the 80387's default evaluation of that formula would provide substantially enhanced numerical integrity over mere double-precision evaluation.

On most machines, straightforward algorithms will not deliver consistently correct results (and will not indicate when they are incorrect). To obtain correct results on traditional machines under all conditions usually requires sophisticated numerical techniques that are foreign to most programmers. General application programmers using straightforward algorithms will produce much more reliable programs using the 80387. This simple fact greatly reduces the software investment required to develop safe, accurate computation-based products.

Beyond traditional numerics support for scientific applications, the 80387 has built-in facilities for commercial computing.
It can process decimal numbers of up to 18 digits without round-off errors, performing exact arithmetic on integers as large as 2^(64) or 10^(18). Exact arithmetic is vital in accounting applications where rounding errors may introduce monetary losses that cannot be reconciled.

The NPX contains a number of optional facilities that can be invoked by sophisticated users. These advanced features include directed rounding, gradual underflow, and programmed exception-handling facilities.

The automatic exception-handling facilities permit a high degree of flexibility in numeric processing software, without burdening the programmer. While performing numeric calculations, the NPX automatically detects exception conditions that can potentially damage a calculation (for example, X ÷ 0 or √X when X < 0). By default, on-chip exception logic handles these exceptions so that a reasonable result is produced and execution may proceed without program interruption. Alternatively, the NPX can signal the CPU, invoking a software exception handler to provide special results whenever various types of exceptions are detected.

1.4 Applications

The 80386's versatility and performance make it appropriate to a broad array of numeric applications. In general, applications that exhibit any of the following characteristics can benefit by implementing numeric processing on the 80387:

• Numeric data vary over a wide range of values, or include nonintegral values.

• Algorithms produce very large or very small intermediate results.

• Computations must be very precise; i.e., a large number of significant digits must be maintained.

• Performance requirements exceed the capacity of traditional microprocessors.

• Consistently safe, reliable results must be delivered using a programming staff that is not expert in numerical techniques.
Note also that the 80387 can reduce software development costs and improve the performance of systems that not only use real numbers but also operate on multiprecision binary or decimal integer values.

A few examples, which show how the 80387 might be used in specific numerics applications, are described below. In many cases, these types of systems have been implemented in the past with minicomputers or small mainframe computers. The advent of the 80387 brings the size and cost savings of microprocessor technology to these applications for the first time.

• Business data processing: The NPX's ability to accept decimal operands and produce exact decimal results of up to 18 digits greatly simplifies accounting programming. Financial calculations that use power functions can take advantage of the 80387's exponentiation and logarithmic instructions. Many business software packages can benefit from the speed and accuracy of the 80387; for example, Lotus* 1-2-3*, Multiplan*, SuperCalc*, and Framework*.

• Simulation: The large (32-bit) memory space of the 80386 coupled with the raw speed of the 80386 and 80387 processors makes 80386/80387 microsystems suitable for attacking large simulation problems, which heretofore could only be executed on expensive mini and mainframe computers. For example, complex electronic circuit simulations using SPICE can now be performed on an 80386/80387 microcomputer. Simulation of mechanical systems using finite element analysis can employ more elements, resulting in more detailed analysis or simulation of larger systems.

• Graphics transformations: The 80387 can be used in graphics terminals to locally perform many functions that normally demand the attention of a main computer; these include rotation, scaling, and interpolation.
By also using an 82786 Graphics Display Controller to perform high-speed drawing and window management, very powerful and highly self-sufficient terminals can be built from a relatively small number of 80386 family parts.
• Process control: The 80387 solves dynamic range problems automatically, and its extended precision allows control functions to be fine-tuned for more accurate and efficient performance. Control algorithms implemented with the NPX also contribute to improved reliability and safety, while the 80387's speed can be exploited in real-time operations.
• Computer numerical control (CNC): The 80387 can move and position machine tool heads with accuracy in real time. Axis positioning also benefits from the hardware trigonometric support provided by the 80387.
• Robotics: Coupling small size and modest power requirements with powerful computational abilities, the 80387 is ideal for on-board six-axis positioning.
• Navigation: Very small, lightweight, and accurate inertial guidance systems can be implemented with the 80387. Its built-in trigonometric functions can speed and simplify the calculation of position from bearing data.
• Data acquisition: The 80387 can be used to scan, scale, and reduce large quantities of data as it is collected, thereby lowering storage requirements and time required to process the data for analysis.

The preceding examples are oriented toward traditional numerics applications. There are, in addition, many other types of systems that do not appear to the end user as computational, but can employ the 80387 to advantage. Indeed, the 80387 presents the imaginative system designer with an opportunity similar to that created by the introduction of the microprocessor itself. Many applications can be viewed as numerically-based if sufficient computational power is available to support this view (e.g., character generation for a laser printer).
This is analogous to the thousands of successful products that have been built around "buried" microprocessors, even though the products themselves bear little resemblance to computers.

1.5 Upgradability

The architecture of the 80386 CPU is specifically adapted to allow easy upgradability to use an 80387, simply by plugging in the 80387 NPX. For this reason, designers of 80386 systems may wish to incorporate the 80387 NPX into their designs in order to offer two levels of price and performance at little additional cost.

Two features of the 80386 CPU make the design and support of upgradable 80386 systems particularly simple:

• The 80386 can be programmed to recognize the presence of an 80387 NPX; that is, software can recognize whether it is running on an 80386 with or without an 80387 NPX.
• After determining whether the 80387 NPX is available, the 80386 CPU can be instructed to let the NPX execute all numeric instructions. If an 80387 NPX is not available, the 80386 CPU can emulate all 80387 numeric instructions in software. This emulation is completely transparent to the application software: the same object code may be used by 80386 systems both with and without an 80387 NPX. No relinking or recompiling of application software is necessary; the same code will simply execute faster with the 80387 NPX than without.

To facilitate this design of upgradable 80386 systems, Intel provides a software emulator for the 80387 that provides the functional equivalent of the 80387 hardware, implemented in software on the 80386. Except for timing, the operation of this 80387 emulator (EMUL387) is the same as for the 80387 NPX hardware. When the emulator is combined as part of the systems software, the 80386 system with 80387 emulation and the 80386 with 80387 hardware are virtually indistinguishable to an application program. This capability makes it easy for software developers to maintain a single set of programs for both systems.
System manufacturers can offer the NPX as a simple plug-in performance option without necessitating any changes in the user's software.

1.6 Programming Interface

The 80386/80387 pair is programmed as a single processor; all of the 80387 registers appear to a programmer as extensions of the basic 80386 register set. The 80386 has a class of instructions known as ESCAPE instructions, all having a common format. These ESC instructions are numeric instructions for the 80387 NPX. These numeric instructions for the 80387 are simply encoded into the instruction stream along with 80386 instructions.

All of the CPU memory-addressing modes may be used in programming the NPX, allowing convenient access to record structures, numeric arrays, and other memory-based data structures. All of the memory management and protection features of the CPU (both paging and segmentation) are extended to the NPX as well.

Numeric processing in the 80387 centers around the NPX register stack. Programmers can treat these eight 80-bit registers either as a fixed register set, with instructions operating on explicitly-designated registers, or as a classical stack, with instructions operating on the top one or two stack elements.

Internally, the 80387 holds all numbers in a uniform 80-bit extended format. Operands that may be represented in memory as 16-, 32-, or 64-bit integers, 32-, 64-, or 80-bit floating-point numbers, or 18-digit packed BCD numbers, are automatically converted into extended format as they are loaded into the NPX registers. Computation results are subsequently converted back into one of these destination data formats when they are stored into memory from the NPX registers.

Table 1-2 lists each of the seven data types supported by the 80387, showing the data format for each type. All operands are stored in memory with the least significant digits starting at the initial (lowest) memory address.
Numeric instructions access and store memory operands using only this initial address. For maximum system performance, all operands should start at memory addresses divisible by four.

Table 1-3 lists the 80387 instructions by class. No special programming tools are necessary to use the 80387, because all of the NPX instructions and data types are directly supported by the ASM386 Assembler, by high-level languages from Intel, and by assemblers and compilers produced by many independent software vendors. Software routines for the 80387 may be written in ASM386 Assembler or any of the following higher-level languages from Intel:

  PL/M-386
  C-386

In addition, all of the development tools supporting the 8086/8087 and 80286/80287 can also be used to develop software for the 80386/80387.

All of these high-level languages provide programmers with access to the computational power and speed of the 80387 without requiring an understanding of the architecture of the 80386 and 80387 chips. Such architectural considerations as concurrency and synchronization are handled automatically by these high-level languages. For the ASM386 programmer, specific rules for handling these issues are discussed in a later section of this manual.

The following operating systems are known or expected to support the 80387: RMX-286/386, MS-DOS, Xenix-286/386, and Unix-286/386. Advanced in-circuit debugging support is provided by ICE-386.

Table 1-2. Numeric Data Types

  Data Type        Bits   Significant        Approximate Range (Decimal)
                          Digits (Decimal)
  Word integer      16     4                 -32,768 ≤ X ≤ +32,767
  Short integer     32     9                 -2*10^(9) ≤ X ≤ +2*10^(9)
  Long integer      64    18                 -9*10^(18) ≤ X ≤ +9*10^(18)
  Packed decimal    80    18                 -99...99 ≤ X ≤ +99...99 (18 digits)
  Single real       32    6-7                1.18*10^(-38) ≤ |X| ≤ 3.40*10^(38)
  Double real       64    15-16              2.23*10^(-308) ≤ |X| ≤ 1.80*10^(308)
  Extended real     80    19                 3.30*10^(-4932) ≤ |X| ≤ 1.21*10^(4932)

  NOTE: Extended real is equivalent to the double extended format of IEEE Std 754.

Table 1-3.
Principal NPX Instructions

  Class               Instruction Types
  Data Transfer       Load (all data types), Store (all data types), Exchange
  Arithmetic          Add, Subtract, Multiply, Divide, Subtract Reversed,
                      Divide Reversed, Square Root, Scale, Remainder,
                      Integer Part, Change Sign, Absolute Value, Extract
  Comparison          Compare, Examine, Test
  Transcendental      Tangent, Arctangent, Sine, Cosine, Sine and Cosine,
                      2^(x) - 1, Y * Log{2}(X), Y * Log{2}(X+1)
  Constants           0, 1, π, Log{10}2, Log{e}2, Log{2}10, Log{2}e
  Processor Control   Load Control Word, Store Control Word, Store Status Word,
                      Load Environment, Store Environment, Save, Restore,
                      Clear Exceptions, Initialize

Chapter 2  80387 Numerics Processor Architecture

To the programmer, the 80387 NPX appears as a set of additional registers, data types, and instructions, all of which complement those of the 80386. Refer to Chapter 4 for detailed explanations of the 80387 instruction set. This chapter explains the new registers and data types that the 80387 brings to the architecture of the 80386.
2.1 80387 Registers

The additional registers consist of

• Eight individually-addressable 80-bit numeric registers, organized as a register stack
• Three sixteen-bit registers containing:
    the NPX status word
    the NPX control word
    the tag word
• Two 48-bit registers containing pointers to the current instruction and operand (these registers are actually located in the 80386)

All of the NPX numeric instructions focus on the contents of these NPX registers.

2.1.1 The NPX Register Stack

The 80387 register stack is shown in Figure 2-1. Each of the eight numeric registers in the 80387's register stack is 80 bits wide and is divided into fields corresponding to the NPX's extended real data type.

Numeric instructions address the data registers relative to the register on the top of the stack. At any point in time, this top-of-stack register is indicated by the TOP (stack TOP) field in the NPX status word. Load or push operations decrement TOP by one and load a value into the new top register. A store-and-pop operation stores the value from the current TOP register and then increments TOP by one. Like 80386 stacks in memory, the 80387 register stack grows down toward lower-addressed registers.

Many numeric instructions have several addressing modes that permit the programmer to implicitly operate on the top of the stack, or to explicitly operate on specific registers relative to the TOP. The ASM386 Assembler supports these register addressing modes, using the expression ST(0), or simply ST, to represent the current Stack Top and ST(i) to specify the ith register from TOP in the stack (0 ≤ i ≤ 7). For example, if TOP contains 011B (register 3 is the top of the stack), the following statement would add the contents of two registers in the stack (registers 3 and 5):

    FADD ST, ST(2)

The stack organization and top-relative addressing of the numeric registers simplify subroutine programming by allowing routines to pass parameters on the register stack.
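
The top-relative addressing rule can be sketched as a small Python model. This is a toy illustration, not a description of the hardware: the class and method names are hypothetical, and tag handling, stack overflow, and stack underflow are deliberately left out.

```python
class RegisterStack:
    """Toy model of the eight-register NPX stack with top-relative addressing."""

    def __init__(self):
        self.regs = [None] * 8   # physical registers R0..R7 (None = empty)
        self.top = 0             # 3-bit TOP field

    def physical(self, i):
        """Physical register number addressed by ST(i)."""
        return (self.top + i) & 7

    def push(self, value):
        """Load: decrement TOP (modulo 8), then write the new top register."""
        self.top = (self.top - 1) & 7
        self.regs[self.top] = value

    def pop(self):
        """Store-and-pop: read the top register, mark it empty, increment TOP."""
        value = self.regs[self.top]
        self.regs[self.top] = None
        self.top = (self.top + 1) & 7
        return value

    def fadd_st_sti(self, i):
        """Model of FADD ST, ST(i): ST(0) <- ST(0) + ST(i)."""
        self.regs[self.top] += self.regs[self.physical(i)]


# Reproduce the example from the text: TOP = 011B, so ST is R3 and ST(2) is R5.
stack = RegisterStack()
stack.top = 3
stack.regs[3] = 1.5
stack.regs[5] = 2.25
stack.fadd_st_sti(2)
print(stack.physical(0), stack.physical(2), stack.regs[3])
```

Because every access goes through physical(), the same sequence of operations works unchanged whatever value TOP holds on entry.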
By using the stack to pass parameters rather than using "dedicated" registers, calling routines gain more flexibility in how they use the stack. As long as the stack is not full, each routine simply loads the parameters onto the stack before calling a particular subroutine to perform a numeric calculation. The subroutine then addresses its parameters as ST, ST(1), etc., even though TOP may, for example, refer to physical register 3 in one invocation and physical register 5 in another.

Figure 2-1. 80387 Register Set

  80387 data registers R0-R7 (80 bits each), each with a 2-bit tag field (bits 1-0):
      bit 79        SIGN
      bits 78-64    EXPONENT
      bits 63-0     SIGNIFICAND
  16-bit registers (bits 15-0):  CONTROL REGISTER, STATUS REGISTER, TAG WORD
  48-bit registers (bits 47-0):  INSTRUCTION POINTER, DATA POINTER

2.1.2 The NPX Status Word

The 16-bit status word shown in Figure 2-2 reflects the overall state of the 80387. This status word may be stored into memory using the FSTSW/FNSTSW, FSTENV/FNSTENV, and FSAVE/FNSAVE instructions, and can be transferred into the 80386 AX register with the FSTSW AX/FNSTSW AX instructions, allowing the NPX status to be inspected by the CPU.

The B-bit (bit 15) is included for 8087 compatibility only. It reflects the contents of the ES bit (bit 7 of the status word), not the status of the BUSY# output of the 80387.

The four NPX condition code bits (C{3}-C{0}) are similar to the flags in a CPU: the 80387 updates these bits to reflect the outcome of arithmetic operations.
The effect of these instructions on the condition code bits is summarized in Table 2-1. These condition code bits are used principally for conditional branching. The FSTSW AX instruction stores the NPX status word directly into the CPU AX register, allowing these condition codes to be inspected efficiently by 80386 code. The 80386 SAHF instruction can copy C{3}-C{0} directly to 80386 flag bits to simplify conditional branching. Table 2-2 shows the mapping of these bits to the 80386 flag bits.

Bits 11-13 of the status word point to the 80387 register that is the current Top of Stack (TOP). The significance of the stack top has been described in the prior section on the register stack.

Figure 2-2 shows the six exception flags in bits 0-5 of the status word. Bit 7 is the exception summary status (ES) bit. ES is set if any unmasked exception bits are set, and is cleared otherwise. If this bit is set, the ERROR# signal is asserted. Bits 0-5 indicate whether the NPX has detected one of six possible exception conditions since these status bits were last cleared or reset. They are "sticky" bits, and can only be cleared by the instructions FINIT, FCLEX, FLDENV, FSAVE, and FRSTOR.

Bit 6 is the stack fault (SF) bit. This bit distinguishes invalid operations due to stack overflow or underflow from other kinds of invalid operations. When SF is set, bit 9 (C{1}) distinguishes between stack overflow (C{1} = 1) and underflow (C{1} = 0).

Figure 2-2.
80387 Status Word

  Bit(s)   Field   Meaning
  15       B       80387 busy
  14       C3      Condition code
  13-11    TOP     Top of stack pointer
  10       C2      Condition code
  9        C1      Condition code
  8        C0      Condition code
  7        ES      Error summary status
  6        SF      Stack fault
  5        PE      Exception flag: precision
  4        UE      Exception flag: underflow
  3        OE      Exception flag: overflow
  2        ZE      Exception flag: zero divide
  1        DE      Exception flag: denormalized operand
  0        IE      Exception flag: invalid operation

  NOTE: ES is set if any unmasked exception bit is set; cleared otherwise.
  See Table 2-1 for interpretation of condition code.
  TOP values:
      000 = Register 0 is top of stack
      001 = Register 1 is top of stack
      ...
      111 = Register 7 is top of stack
  For definitions of exceptions, refer to Chapter 3.

Table 2-1.
Condition Code Interpretation

  Instruction                 C0 (S)      C3 (Z)      C1 (A)                C2 (C)

  FPREM, FPREM1               Q2          Q1          Q0 or O/U#            Reduction:
                              (three least significant bits of quotient)    0=complete,
                                                                            1=incomplete

  FCOM, FCOMP, FCOMPP,        Result of comparison    Zero or O/U#          Operand is not
  FTST, FUCOM, FUCOMP,                                                      comparable
  FUCOMPP, FICOM, FICOMP

  FXAM                        Operand class           Sign or O/U#          Operand class

  FCHS, FABS, FXCH,           UNDEFINED               Zero or O/U#          UNDEFINED
  FINCTOP, FDECTOP,
  constant loads, FXTRACT,
  FLD, FILD, FBLD,
  FSTP (ext real)

  FIST, FBSTP, FRNDINT,       UNDEFINED               Roundup or O/U#       UNDEFINED
  FST, FSTP, FADD, FMUL,
  FDIV, FDIVR, FSUB, FSUBR,
  FSCALE, FSQRT, FPATAN,
  F2XM1, FYL2X, FYL2XP1

  FPTAN, FSIN, FCOS,          UNDEFINED               Roundup or O/U#       Reduction:
  FSINCOS                                             (undefined if C2=1)   0=complete,
                                                                            1=incomplete

  FLDENV, FRSTOR              Each bit loaded from memory

  FLDCW, FSTENV, FSTCW,       UNDEFINED
  FSTSW, FCLEX, FINIT,
  FSAVE

  NOTES
  O/U#        When both IE and SF bits of the status word are set, indicating a
              stack exception, this bit distinguishes between stack overflow
              (C1=1) and underflow (C1=0).
  Reduction   If FPREM or FPREM1 produces a remainder that is less than the
              modulus, reduction is complete. When reduction is incomplete, the
              value at the top of the stack is a partial remainder, which can be
              used as input to further reduction. For FPTAN, FSIN, FCOS, and
              FSINCOS, the reduction bit is set if the operand at the top of the
              stack is too large. In this case the original operand remains at
              the top of the stack.
  Roundup     When the PE bit of the status word is set, this bit indicates
              whether the last rounding in the instruction was upward.
  UNDEFINED   Do not rely on finding any specific value in these bits.

Table 2-2.
Correspondence between 80387 and 80386 Flag Bits

  80387 Flag    80386 Flag
  C{0}          CF
  C{1}          (none)
  C{2}          PF
  C{3}          ZF

2.1.3 Control Word

The NPX provides the programmer with several processing options, which are selected by loading a word from memory into the control word. Figure 2-3 shows the format and encoding of the fields in the control word.

The low-order byte of this control word configures the 80387 exception masking. Bits 0-5 of the control word contain individual masks for each of the six exception conditions recognized by the 80387. The high-order byte of the control word configures the 80387 processing options, including

• Precision control
• Rounding control

The precision-control bits (bits 8-9) can be used to set the 80387 internal operating precision at less than the default precision (64-bit significand). These control bits can be used to provide compatibility with the earlier-generation arithmetic processors having less precision than the 80387. The precision-control bits affect the results of only the following five arithmetic instructions: ADD, SUB(R), MUL, DIV(R), and SQRT. No other operations are affected by PC.

The rounding-control bits (bits 10-11) provide for the common round-to-nearest mode, as well as directed rounding and true chop. Rounding control affects only the arithmetic instructions (refer to Chapter 3 for lists of arithmetic and nonarithmetic instructions).

Figure 2-3. 80387 Control Word Format

  Bit(s)   Field   Meaning
  15-13    X       Reserved
  12       X       Reserved (infinity control; see note)
  11-10    RC      Rounding control
  9-8      PC      Precision control
  7-6      X       Reserved
  5        PM      Exception mask: precision
  4        UM      Exception mask: underflow
  3        OM      Exception mask: overflow
  2        ZM      Exception mask: zero divide
  1        DM      Exception mask: denormalized operand
  0        IM      Exception mask: invalid operation

  NOTE: The "infinity control" bit is not meaningful to the 80387. To maintain
  compatibility with the 80287, this bit can be programmed; however, regardless
  of its value, the 80387 treats infinity in the affine sense (-∞ < +∞).

  PRECISION CONTROL                     ROUNDING CONTROL
  00--24 bits (single precision)        00--Round to nearest or even
  01--(reserved)                        01--Round down (toward -∞)
  10--53 bits (double precision)        10--Round up (toward +∞)
  11--64 bits (extended precision)      11--Chop (truncate toward zero)

2.1.4 The NPX Tag Word

The tag word indicates the contents of each register in the register stack, as shown in Figure 2-4. The tag word is used by the NPX itself to distinguish between empty and nonempty register locations. Programmers of exception handlers may use this tag information to check the contents of a numeric register without performing complex decoding of the actual data in the register. The tag values from the tag word correspond to physical registers 0-7. Programmers must use the current top-of-stack (TOP) pointer stored in the NPX status word to associate these tag values with the relative stack registers ST(0) through ST(7).

The exact values of the tags are generated during execution of the FSTENV and FSAVE instructions according to the actual contents of the nonempty stack locations. During execution of other instructions, the 80387 updates the TW only to indicate whether a stack location is empty or nonempty.

Figure 2-4.
80387 Tag Word Format

  Bits     15-14    13-12    11-10    9-8      7-6      5-4      3-2      1-0
  Field    TAG(7)   TAG(6)   TAG(5)   TAG(4)   TAG(3)   TAG(2)   TAG(1)   TAG(0)

  TAG VALUES:
      00 = VALID
      01 = ZERO
      10 = INVALID OR INFINITY
      11 = EMPTY

2.1.5 The NPX Instruction and Data Pointers

The instruction and data pointers provide support for programmed exception-handlers. These registers are actually located in the 80386, but appear to be located in the 80387 because they are accessed by the ESC instructions FLDENV, FSTENV, FSAVE, and FRSTOR. Whenever the 80386 decodes an ESC instruction, it saves the instruction address, the operand address (if present), and the instruction opcode.

When stored in memory, the instruction and data pointers appear in one of four formats, depending on the operating mode of the 80386 (protected mode or real-address mode) and depending on the operand-size attribute in effect (32-bit operand or 16-bit operand). When the 80386 is in virtual-8086 mode, the real-address mode formats are used. Figures 2-5 through 2-8 show these pointers as they are stored following an FSTENV instruction.

The FSTENV and FSAVE instructions store this data into memory, allowing exception handlers to determine the precise nature of any numeric exceptions that may be encountered.

The instruction address saved in the 80386 (as in the 80287) points to any prefixes that preceded the instruction. This is different from the 8087, for which the instruction address points only to the ESC instruction opcode.

Note that the processor control instructions FINIT, FLDCW, FSTCW, FSTSW, FCLEX, FSTENV, FLDENV, FSAVE, FRSTOR, and FWAIT do not affect the data pointer. Note also that, except for the instructions just mentioned, the value of the data pointer is undefined if the prior ESC instruction did not have a memory operand.

Figure 2-5.
Protected Mode 80387 Instruction and Data Pointer Image in Memory, 32-Bit Format

  32-BIT PROTECTED MODE FORMAT

  Offset   Bits 31-16                        Bits 15-0
  0H       RESERVED                          CONTROL WORD
  4H       RESERVED                          STATUS WORD
  8H       RESERVED                          TAG WORD
  CH       IP OFFSET (all 32 bits)
  10H      0 0 0 0 0, OPCODE 10..0           CS SELECTOR
  14H      DATA OPERAND OFFSET (all 32 bits)
  18H      RESERVED                          OPERAND SELECTOR

Figure 2-6. Real Mode 80387 Instruction and Data Pointer Image in Memory, 32-Bit Format

  32-BIT REAL ADDRESS MODE FORMAT

  Offset   Fields (most significant first)
  0H       RESERVED (31-16), CONTROL WORD (15-0)
  4H       RESERVED (31-16), STATUS WORD (15-0)
  8H       RESERVED (31-16), TAG WORD (15-0)
  CH       RESERVED (31-16), INSTRUCTION POINTER 15..0 (15-0)
  10H      0 0 0 0, INSTRUCTION POINTER 31..16, 0, OPCODE 10..0
  14H      RESERVED (31-16), OPERAND POINTER 15..0 (15-0)
  18H      0 0 0 0, OPERAND POINTER 31..16, 0 0 0 0 0 0 0 0 0 0 0 0

Figure 2-7.
Protected Mode 80387 Instruction and Data Pointer Image in Memory, 16-Bit Format

  16-BIT PROTECTED MODE FORMAT

  Offset   Bits 15-0
  0H       CONTROL WORD
  2H       STATUS WORD
  4H       TAG WORD
  6H       IP OFFSET
  8H       CS SELECTOR
  AH       OPERAND OFFSET
  CH       OPERAND SELECTOR

Figure 2-8. Real Mode 80387 Instruction and Data Pointer Image in Memory, 16-Bit Format

  16-BIT REAL-ADDRESS MODE AND VIRTUAL-8086 MODE FORMAT

  Offset   Fields (most significant first)
  0H       CONTROL WORD
  2H       STATUS WORD
  4H       TAG WORD
  6H       INSTRUCTION POINTER 15..0
  8H       IP 19..16, 0, OPCODE 10..0
  AH       OPERAND POINTER 15..0
  CH       OP 19..16, 0 0 0 0 0 0 0 0 0 0 0 0

2.2 Computation Fundamentals

This section covers 80387 programming concepts that are common to all applications. It describes the 80387's internal number system and the various types of numbers that can be employed in NPX programs. The most commonly used options for rounding and precision (selected by fields in the control word) are described, with exhaustive coverage of less frequently used facilities deferred to later sections. Exception conditions that may arise during execution of NPX instructions are also described along with the options that are available for responding to these exceptions.

2.2.1 Number System

The system of real numbers that people use for pencil and paper calculations is conceptually infinite and continuous. There is no upper or lower limit to the magnitude of the numbers one can employ in a calculation, or to the precision (number of significant digits) that the numbers can represent.
When considering any real number, there are always arbitrarily many numbers both larger and smaller. There are also arbitrarily many numbers between (i.e., with more significant digits than) any two real numbers. For example, between 2.5 and 2.6 are 2.51, 2.5897, 2.500001, etc.

While ideally it would be desirable for a computer to be able to operate on the entire real number system, in practice this is not possible. Computers, no matter how large, ultimately have fixed-size registers and memories that limit the system of numbers that can be accommodated. These limitations determine both the range and the precision of numbers. The result is a set of numbers that is finite and discrete, rather than infinite and continuous. This sequence is a subset of the real numbers that is designed to form a useful approximation of the real number system.

Figure 2-9 superimposes the basic 80387 real number system on a real number line (decimal numbers are shown for clarity, although the 80387 actually represents numbers in binary). The dots indicate the subset of real numbers the 80387 can represent as data and final results of calculations. The 80387's range of double-precision, normalized numbers is approximately ±2.23 * 10^(-308) to ±1.80 * 10^(308). Applications that are required to deal with data and final results outside this range are rare. For reference, the range of the IBM System 370* is about ±0.54 * 10^(-78) to ±0.72 * 10^(76).

The finite spacing in Figure 2-9 illustrates that the NPX can represent a great many, but not all, of the real numbers in its range. There is always a gap between two adjacent 80387 numbers, and it is possible for the result of a calculation to fall in this space. When this occurs, the NPX rounds the true result to a number that it can represent. Thus, a real number that requires more digits than the 80387 can accommodate (e.g., a 20-digit number) is represented with some loss of accuracy.
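
The gaps and the rounding described above can be observed from Python. This is a sketch under two assumptions: that Python's float is an IEEE double, which corresponds to the 80387 double real format on essentially all current platforms, and that Python 3.9 or later is available for math.ulp.

```python
import math

# Gap between each value and the next larger representable double.
for x in (1.0, 2.0, 65536.0, 1.0e15):
    print(x, math.ulp(x))

# A 20-digit integer cannot be held exactly in a double; the conversion
# rounds it to the nearest representable value.
true_value = 12345678901234567890   # exact Python integer (20 digits)
stored = float(true_value)          # nearest double-precision value
error = int(stored) - true_value    # rounding error introduced by the conversion
print(error, math.ulp(stored))
```

With round-to-nearest, the magnitude of the error is never more than half the gap reported by math.ulp(stored).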
Notice also that the 80387's representable numbers are not distributed evenly along the real number line. In fact, an equal number of representable numbers exists between successive powers of 2 (i.e., as many representable numbers exist between 2 and 4 as between 65,536 and 131,072). Therefore, the gaps between representable numbers are larger as the numbers increase in magnitude. All integers in the range ±2^(64) (approximately ±1.8 * 10^(19)), however, are exactly representable in the extended real format.

In its internal operations, the 80387 actually employs a number system that is a substantial superset of that shown in Figure 2-9. The internal format (called extended real) extends the 80387's range to about ±3.30 * 10^(-4932) to ±1.21 * 10^(4932), and its precision to about 19 (equivalent decimal) digits. This format is designed to provide extra range and precision for constants and intermediate results, and is not normally intended for data or final results.

From a practical standpoint, the 80387's set of real numbers is sufficiently large and dense so as not to limit the vast majority of microprocessor applications. Compared to most computers, including mainframes, the NPX provides a very good approximation of the real number system. It is important to remember, however, that it is not an exact representation, and that arithmetic on real numbers is inherently approximate.

Conversely, and equally important, the 80387 does perform exact arithmetic on integer operands. That is, if an operation on two integers is valid and produces a result that is in range, the result is exact. For example, 4 ÷ 2 yields an exact integer, 1 ÷ 3 does not, and 2^(40) * 2^(30) + 1 does not, because the result requires greater than 64 bits of precision.

Figure 2-9.
80387 Double-Precision Number System

  [Number-line figure not reproduced. Recoverable labels:
   negative range (normalized): -1.80 X 10^(308) to -2.23 X 10^(-308);
   positive range (normalized): 2.23 X 10^(-308) to 1.80 X 10^(308);
   tick marks at -5 ... -1 and 1 ... 5;
   inset labels: 2.00000000000000000, 1.99999999999999999,
   (NOT REPRESENTABLE), PRECISION: 18 DIGITS.]

2.2.2 Data Types and Formats

The 80387 recognizes seven numeric data types for memory-based values, divided into three classes: binary integers, packed decimal integers, and binary reals. A later section describes how these formats are stored in memory (the sign is always located in the highest-addressed byte).

Figure 2-10 summarizes the format of each data type. In the figure, the most significant digits of all numbers (and fields within numbers) are the leftmost digits.

Figure 2-10.
80387 Data Formats

  Data Format          Range        Precision   Layout (most significant bit first)
  Word integer         10^(4)       16 bits     bits 15-0, two's complement
  Short integer        10^(9)       32 bits     bits 31-0, two's complement
  Long integer         10^(19)      64 bits     bits 63-0, two's complement
  Packed BCD           10^(18)      18 digits   S (bit 79), X (bits 78-72),
                                                magnitude d{17} d{16} ... d{1} d{0} (bits 71-0)
  Single precision     10^(±38)     24 bits     S (bit 31), BE (bits 30-23),
                                                significand (bits 22-0)
  Double precision     10^(±308)    53 bits     S (bit 63), BE (bits 62-52),
                                                significand (bits 51-0)
  Extended precision   10^(±4932)   64 bits     S (bit 79), BE (bits 78-64),
                                                I (bit 63), significand (bits 62-0)

  NOTE:
  (1) BE = biased exponent
  (2) S = sign bit (0 = positive, 1 = negative)
  (3) d{n} = decimal digit (two per byte)
  (4) X = bits have no significance; 80387 ignores when loading, zeros when storing
  (5) The implicit binary point lies immediately to the right of the integer bit
  (6) I = integer bit of significand; stored in temporary real, implicit in
      single and double precision
  (7) Exponent bias (normalized values):
      single: 127 (7FH)
      double: 1023 (3FFH)
      extended real: 16383 (3FFFH)
  (8) Packed BCD: (-1)^(S) (D{17}...D{0})
  (9) Real: (-1)^(S) (2^(E-BIAS)) (F{0}.F{1}...)

2.2.2.1 Binary Integers

The three binary integer formats are identical except for length, which governs the range that can be accommodated in each format. The leftmost bit is interpreted as the number's sign: 0 = positive and 1 = negative. Negative numbers are represented in standard two's complement notation (the binary integers are the only 80387 format to use two's complement). The quantity zero is represented with a positive sign (all bits are 0). The 80387 word integer format is identical to the 16-bit signed integer data type of the 80386; the 80387 short integer format is identical to the 32-bit signed integer data type of the 80386.

The binary integer formats exist in memory only.
When used by the 80387, they are automatically converted to the 80-bit extended real format. All binary integers are exactly representable in the extended real format.

2.2.2.2 Decimal Integers

Decimal integers are stored in packed decimal notation, with two decimal digits "packed" into each byte, except the leftmost byte, which carries the sign bit (0 = positive, 1 = negative). Negative numbers are not stored in two's complement form and are distinguished from positive numbers only by the sign bit. The most significant digit of the number is the leftmost digit. All digits must be in the range 0-9.

The decimal integer format exists in memory only. When used by the 80387, it is automatically converted to the 80-bit extended real format. All decimal integers are exactly representable in the extended real format.

2.2.2.3 Real Numbers

The 80387 represents real numbers of the form:

   (-1)^(s) 2^(E) (b{0}{}b{1}b{2}b{3}..b{p-1})

where:

   s     = 0 or 1
   E     = any integer between Emin and Emax, inclusive
   b{i}  = 0 or 1
   p     = number of bits of precision
   {}    = position of the assumed binary point

Table 2-3 summarizes the parameters for each of the three real-number formats.

The 80387 stores real numbers in a three-field binary format that resembles scientific, or exponential, notation. The format consists of the following fields:

   - The number's significant digits are held in the significand field, b{0}{}b{1}b{2}b{3}..b{p-1}. (The term "significand" is analogous to the term "mantissa" used to describe floating-point numbers on some computers.)

   - The exponent field, e = E+bias, locates the binary point within the significant digits (and therefore determines the number's magnitude). (The term "exponent" is analogous to the term "characteristic" used to describe floating-point numbers on some computers.)

   - The 1-bit sign field indicates whether the number is positive or negative. Negative numbers differ from positive numbers only in the sign bits of their significands.
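The three fields can be inspected directly for a value stored in the single real format. The following is a minimal sketch in Python, not part of the 80387 instruction set or of the Intel translators; the function name single_fields and the constant SINGLE_BIAS are illustrative assumptions, and the host's struct module is assumed to produce the same 32-bit single real encoding that the 80387 stores in memory. It splits the value 178.125 (the example used in Table 2-4) into sign, biased exponent, and fraction bits.

```python
import struct

SINGLE_BIAS = 127  # exponent bias of the single real format (Table 2-3)


def single_fields(value):
    """Return (sign, biased_exponent, fraction) of value in single real format."""
    # Pack as a 32-bit single, then reinterpret the same bytes as an integer.
    (bits,) = struct.unpack("<I", struct.pack("<f", value))
    sign = bits >> 31                      # 1-bit sign field
    biased_exponent = (bits >> 23) & 0xFF  # 8-bit exponent field, e = E + bias
    fraction = bits & 0x7FFFFF             # 23 stored fraction bits (integer bit implicit)
    return sign, biased_exponent, fraction


sign, e, fraction = single_fields(178.125)
print(sign, format(e, "08b"), format(fraction, "023b"))
# prints: 0 10000110 01100100010000000000000
print("true exponent E =", e - SINGLE_BIAS)
# prints: true exponent E = 7
```

Subtracting the bias from the stored exponent field recovers the true exponent E; the integer bit of the significand does not appear in the stored pattern because it is implicit in the single format.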
Table 2-4 shows how the real number 178.125 (decimal) is stored in the 80387 single real format. The table lists a progression of equivalent notations that express the same value to show how a number can be converted from one form to another. (The ASM386 and PL/M-386 language translators perform a similar process when they encounter programmer-defined real number constants.) Note that not every decimal fraction has an exact binary equivalent. The decimal number 1/10, for example, cannot be expressed exactly in binary (just as the number 1/3 cannot be expressed exactly in decimal). When a translator encounters such a value, it produces a rounded binary approximation of the decimal value. The NPX usually carries the digits of the significand in normalized form. This means that, except for the value zero, the significand contains an integer bit and fraction bits as follows: 1{}fff...ff where {} indicates an assumed binary point. The number of fraction bits varies according to the real format: 23 for single, 52 for double, and 63 for extended real. By normalizing real numbers so that their integer bit is always a 1, the 80387 eliminates leading zeros in small values (X < 1). This technique maximizes the number of significant digits that can be accommodated in a significand of a given width. Note that, in the single and double formats, the integer bit is implicit and is not actually stored; the integer bit is physically present in the extended format only. If one were to examine only the significand with its assumed binary point, all normalized real numbers would have values greater than or equal to 1 and less than 2. The exponent field locates the actual binary point in the significant digits. Just as in decimal scientific notation, a positive exponent has the effect of moving the binary point to the right, and a negative exponent effectively moves the binary point to the left, inserting leading zeros as necessary. 
An unbiased exponent of zero indicates that the position of the assumed binary point is also the position of the actual binary point. The exponent field, then, determines a real number's magnitude. In order to simplify comparing real numbers (e.g., for sorting), the 80387 stores exponents in a biased form. This means that a constant is added to the true exponent described above. As Table 2-3 shows, the value of this bias is different for each real format. It has been chosen so as to force the biased exponent to be a positive value. This allows two real numbers (of the same format and sign) to be compared as if they are unsigned binary integers. That is, when comparing them bitwise from left to right (beginning with the leftmost exponent bit), the first bit position that differs orders the numbers; there is no need to proceed further with the comparison. A number's true exponent can be determined simply by subtracting the bias value of its format. The single and double real formats exist in memory only. If a number in one of these formats is loaded into an 80387 register, it is automatically converted to extended format, the format used for all internal operations. Likewise, data in registers can be converted to single or double real for storage in memory. The extended real format may be used in memory also, typically to store intermediate results that cannot be held in registers. Most applications should use the double format to store real-number data and results; it provides sufficient range and precision to return correct results with a minimum of programmer attention. The single real format is appropriate for applications that are constrained by memory, but it should be recognized that this format provides a smaller margin of safety. It is also useful for the debugging of algorithms, because roundoff problems will manifest themselves more quickly in this format. 
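The remark that roundoff problems show up sooner in the single format can be observed with a small experiment. The sketch below is an illustration under stated assumptions, not 80387 code: Python floats are taken to be double format values, the single format is emulated by rounding to 32 bits after every addition, and neither path uses the 80387's extended-format intermediates. The helper names to_single and accumulate are hypothetical.

```python
import struct


def to_single(x):
    """Round a double-format value to the nearest single-format value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def accumulate(step, count, single):
    """Add step to a running total count times, optionally in single format."""
    if single:
        step = to_single(step)
    total = 0.0
    for _ in range(count):
        total += step
        if single:
            total = to_single(total)  # emulate storing each result as single
    return total


# 0.1 has no exact binary representation, so every addition may round.
double_sum = accumulate(0.1, 1000, single=False)
single_sum = accumulate(0.1, 1000, single=True)
print("double format error:", abs(double_sum - 100.0))
print("single format error:", abs(single_sum - 100.0))
```

The single-format error is larger by many orders of magnitude, which is why a trial run in single format tends to expose a numerically fragile algorithm quickly.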
The extended real format should normally be reserved for holding intermediate results, loop accumulations, and constants. Its extra length is designed to shield final results from the effects of rounding and overflow/underflow in intermediate calculations. However, the range and precision of the double format are adequate for most microcomputer applications.

Table 2-3. Summary of Format Parameters

                                       Format
Parameter                   Single     Double     Extended
Format width in bits          32         64          80
p (bits of precision)         24         53          64
Exponent width in bits         8         11          15
Emax                        +127      +1023      +16383
Emin                        -126      -1022      -16382
Exponent bias               +127      +1023      +16383

Table 2-4. Real Number Notation

Notation                               Value
Ordinary Decimal                       178.125
Scientific Decimal                     1{}78125E2
Scientific Binary                      1{}0110010001E111
Scientific Binary (Biased Exponent)    1{}0110010001E10000110

80387 Single Format (Normalized)
   Sign               0
   Biased Exponent    10000110
   Significand        01100100010000000000000   (integer bit 1{} is implicit)

2.2.3 Rounding Control

Internally, the 80387 employs three extra bits (guard, round, and sticky bits) that enable it to round numbers in accord with the infinitely precise true result of a computation; these bits are not accessible to programmers. Whenever the destination can represent the infinitely precise true result, the 80387 delivers it. Rounding occurs in arithmetic and store operations when the format of the destination cannot exactly represent the infinitely precise true result. For example, a real number may be rounded if it is stored in a shorter real format, or in an integer format. Or, the infinitely precise true result may be rounded when it is returned to a register.

The NPX has four rounding modes, selectable by the RC field in the control word (see Figure 2-3). Given a true result b that cannot be represented by the target data type, the 80387 determines the two representable numbers a and c that most closely bracket b in value (a < b < c).
The processor then rounds (changes) b to a or to c according to the mode selected by the RC field as shown in Table 2-5. Rounding introduces an error in a result that is less than one unit in the last place to which the result is rounded.

   - "Round to nearest" is the default mode and is suitable for most applications; it provides the most accurate and statistically unbiased estimate of the true result.

   - The "chop" or "round toward zero" mode is provided for integer arithmetic applications.

   - "Round up" and "round down" are termed directed rounding and can be used to implement interval arithmetic. Interval arithmetic generates a certifiable result independent of the occurrence of rounding and other errors. The upper and lower bounds of an interval may be computed by executing an algorithm twice, rounding up in one pass and down in the other.

Rounding control affects only the arithmetic instructions (refer to Chapter 3 for lists of arithmetic and nonarithmetic instructions).

2.2.4 Precision Control

The 80387 allows results to be calculated with either 64, 53, or 24 bits of precision in the significand as selected by the precision control (PC) field of the control word. The default setting, and the one that is best suited for most applications, is the full 64 bits of significance provided by the extended real format. The other settings are required by the IEEE standard and are provided to obtain compatibility with the specifications of certain existing programming languages. Specifying less precision nullifies the advantages of the extended format's extended fraction length. When reduced precision is specified, the rounding of the fractional value clears the unused bits on the right to zeros.

Table 2-5. Rounding Modes

RC Field   Rounding Mode             Rounding Action
00         Round to nearest          Closer to b of a or c; if equally close, select even number (the one whose least significant bit is zero).
01         Round down (toward -∞)    a
10         Round up (toward +∞)      c
11         Chop (toward 0)           Smaller in magnitude of a or c.

NOTE: a < b < c; a and c are successive representable numbers; b is not representable.

Chapter 3  Special Computational Situations

Besides being able to represent positive and negative numbers, the 80387 data formats may be used to describe other entities. These special values provide extra flexibility, but most users will not need to understand them in order to use the 80387 successfully. This section describes the special values that may occur in certain cases and the significance of each. The 80387 exceptions are also described, for writers of exception handlers and for those interested in probing the limits of computation using the 80387.

The material presented in this section is mainly of interest to programmers concerned with writing exception handlers. Many readers will only need to skim this section.

When discussing these special computational situations, it is useful to distinguish between arithmetic instructions and nonarithmetic instructions. Nonarithmetic instructions are those that have no operands or transfer their operands without substantial change; arithmetic instructions are those that make significant changes to their operands. Table 3-1 defines these two classes of instructions.

Table 3-1. Arithmetic and Nonarithmetic Instructions

Nonarithmetic Instructions              Arithmetic Instructions
FABS                                    F2XM1
FCHS                                    FADD(P)
FCLEX                                   FBLD
FDECSTP                                 FBSTP
FFREE                                   FCOM(P)(P)
FINCSTP                                 FCOS
FINIT                                   FDIV(R)(P)
FLD (register-to-register)              FIADD
FLD (extended format from memory)       FICOM(P)
FLD constant                            FIDIV(R)
FLDCW                                   FILD
FLDENV                                  FIMUL
FNOP                                    FIST(P)
FRSTOR                                  FISUB(R)
FSAVE                                   FLD (conversion)
FST(P) (register-to-register)           FMUL(P)
FSTP (extended format to memory)        FPATAN
FSTCW                                   FPREM
FSTENV                                  FPREM1
FSTSW                                   FPTAN
FWAIT                                   FRNDINT
FXAM                                    FSCALE
FXCH                                    FSIN
                                        FSINCOS
                                        FSQRT
                                        FST(P) (conversion)
                                        FSUB(R)(P)
                                        FTST
                                        FUCOM(P)(P)
                                        FXTRACT
                                        FYL2X
                                        FYL2XP1

3.1 Special Numeric Values

The 80387 data formats encompass encodings for a variety of special values in addition to the typical real or integer data values that result from normal calculations. These special values have significance and can express relevant information about the computations or operations that produced them. The various types of special values are

   - Denormal real numbers
   - Zeros
   - Positive and negative infinity
   - NaN (Not-a-Number)
   - Indefinite
   - Unsupported formats

The following sections explain the origins and significance of each of these special values. Tables 3-6 through 3-9 at the end of this section show how each of these special values is encoded for each of the numeric data types.

3.1.1 Denormal Real Numbers

The 80387 generally stores nonzero real numbers in normalized floating-point form; that is, the integer (leading) bit of the significand is always a one. (Refer to Chapter 2 for a review of operand formats.) This bit is explicitly stored in the extended format, and is implicitly assumed to be a one (1{}) in the single and double formats. Since leading zeros are eliminated, normalized storage allows the maximum number of significant digits to be held in a significand of a given width.

When a numeric value becomes very close to zero, normalized floating-point storage cannot be used to express the value accurately.
The term tiny is used here to precisely define what values require special handling by the 80387. A number R is said to be tiny when -2^(Emin) < R < 0 or 0 < R < +2^(Emin). (As defined in Chapter 2, Emin is -126 for single format, -1022 for double format, and -16382 for extended format.) In other words, a nonzero number is tiny if its exponent would be too negative to store in the destination format.

To accommodate these instances, the 80387 can store and operate on reals that are not normalized, i.e., whose significands contain one or more leading zeros. Denormals typically arise when the result of a calculation yields a value that is tiny.

Denormal values have the following properties:

   - The biased floating-point exponent is stored at its smallest value (zero).

   - The integer bit of the significand (whether explicit or implicit) is zero.

The leading zeros of denormals permit smaller numbers to be represented, at the possible cost of some lost precision (the number of significant bits is reduced by the leading zeros). In typical algorithms, extremely small values are most likely to be generated as intermediate, rather than final, results. By using the NPX's extended real format for holding intermediate values, quantities as small as ±3.4*10^(-4932) can be represented; this makes the occurrence of denormal numbers a rare phenomenon in 80387 applications. Nevertheless, the NPX can load, store, and operate on denormalized real numbers when they do occur.

Denormals receive special treatment by the 80387 in three respects:

   - The 80387 avoids creating denormals whenever possible. In other words, it always normalizes real numbers except in the case of tiny numbers.

   - The 80387 provides the unmasked underflow exception to permit programmers to detect cases when denormals would be created.

   - The 80387 provides the denormal exception to permit programmers to detect cases when denormals enter into further calculations.
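The two properties of a denormal (biased exponent of zero, integer bit of zero) give a simple test on the stored bit pattern. The following Python sketch applies that test to the single format, where the integer bit is implicit, so a denormal is recognized by a zero exponent field with a nonzero fraction. The function name classify_single is a hypothetical helper, and the host's struct module is assumed to round to the single format the same way a store to a single real memory operand would under the default rounding mode.

```python
import struct

E_MIN_SINGLE = -126  # Emin for the single format (Chapter 2)


def classify_single(value):
    """Classify value by the exponent and fraction fields of its single encoding."""
    (bits,) = struct.unpack("<I", struct.pack("<f", value))
    biased_exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    if biased_exponent == 0:
        # Smallest exponent encoding: a zero, or a denormal if any fraction bit is set.
        return "zero" if fraction == 0 else "denormal"
    if biased_exponent == 0xFF:
        # All-ones exponent is reserved for infinities and NaNs.
        return "infinity" if fraction == 0 else "NaN"
    return "normal"


smallest_normal = 2.0 ** E_MIN_SINGLE
for v in (smallest_normal, smallest_normal / 2, 2.0 ** -149, 0.0):
    print(v, classify_single(v))
```

Any nonzero magnitude below 2^(Emin) is tiny for the single format; when such a value is stored, it can only be held as a denormal (or, in the extreme case, as zero).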
Denormalizing means incrementing the true result's exponent and inserting a corresponding leading zero in the significand, shifting the rest of the significand one place to the right. Denormal values may occur in any of the single, double, or extended formats. Table 3-2 illustrates how a result might be denormalized to fit a single format destination.

Denormalization produces either a denormal or a zero. Denormals are readily identified by their exponents, which are always the minimum for their formats; in biased form, this is always the bit string: 00..00. This same exponent value is also assigned to the zeros, but a denormal has a nonzero significand. A denormal in a register is tagged special. Tables 3-8 and 3-9 show how denormal values are encoded in each of the real data formats.

The denormalization process causes loss of significance if low-order one-bits are shifted off the right of the significand. In a severe case, all the significand bits of the true result are shifted out and replaced by the leading zeros. In this case, the result of denormalization is a true zero, and, if the value is in a register, it is tagged as a zero.

Denormals are rarely encountered in most applications. Typical debugged algorithms generate extremely small results during the evaluation of intermediate subexpressions; the final result is usually of an appropriate magnitude for its single or double format real destination. If intermediate results are held in temporary real, as is recommended, the great range of this format makes underflow very unlikely. Denormals are likely to arise only when an application generates a great many intermediates, so many that they cannot be held on the register stack or in extended format memory variables. If storage limitations force the use of single or double format reals for intermediates, and small values are produced, underflow may occur, and, if masked, may generate denormals.
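The denormalization step described above (increment the exponent, shift the significand one place to the right, repeat until the exponent reaches the minimum for the format) can be sketched as a short loop. This is an illustrative model only, not the 80387's internal algorithm: the function name denormalize is hypothetical, the significand is modeled as a 24-bit integer whose top bit is the integer bit, and bits shifted off the right are simply discarded and reported rather than rounded according to the RC field. The first call uses the values of Table 3-2.

```python
def denormalize(exponent, significand, e_min=-126):
    """Shift significand right until exponent reaches e_min.

    Returns (exponent, significand, lost), where lost is True if any
    one-bit was shifted off the right (loss of significance).
    """
    lost = False
    while exponent < e_min:
        lost = lost or bool(significand & 1)  # a low-order one-bit is about to be lost
        significand >>= 1                     # insert a leading zero
        exponent += 1                         # compensate in the exponent
    return exponent, significand, lost


# True result of Table 3-2: exponent -129, significand 1{}01011100..00 (24 bits).
true_significand = 0b101011100000000000000000
exp, sig, lost = denormalize(-129, true_significand)
print(exp, format(sig, "024b"), lost)
# prints: -126 000101011100000000000000 False

# Severe case: every significand bit is shifted out and the result is a true zero.
print(denormalize(-126 - 24, 0b100000000000000000000000))
# prints: (-126, 0, True)
```

Three shifts bring the exponent from -129 to -126 and leave three leading zeros in the significand, matching the last row of Table 3-2.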
When a denormal number in single or double format is used as a source operand and the denormal exception is masked, the 80387 automatically normalizes the number when it is converted to extended format.

Table 3-2. Denormalization Process

Operation          Sign   Exponent   Significand
True Result         0      -129      1{}01011100..00
Denormalize         0      -128      0{}101011100..00
Denormalize         0      -127      0{}0101011100..00
Denormalize         0      -126      0{}00101011100..00
Denormal Result     0      -126      0{}00101011100..00

3.1.1.1 Denormals and Gradual Underflow

Floating-point arithmetic cannot carry out all operations exactly for all operands; approximation is unavoidable when the exact result is not representable as a floating-point variable. To keep the approximation mathematically tractable, the hardware is made to conform to accuracy standards that can be modeled by certain inequalities instead of equations. Let the assignment X ← Y @ Z (where @ is some operation) represent a typical operation. In the default rounding mode (round to nearest), each operation is carried out with an absolute error no larger than half the separation between the two floating-point numbers closest to the exact result. Let x be the value stored for the variable whose name in the program is X, and similarly y for Y, and z for Z. Normally y and z will differ by accumulated errors from what is desired and from what would have been obtained in the absence of error. For the calculation of x we assume that y and z are the best approximations available, and we seek to compute x as well as we can.

If y@z is representable exactly, then we expect x = y@z, and that is what we get for every algebraic operation on the 80387 (i.e., when y@z is one of y+z, y-z, y*z, y÷z, sqrt z). But if y@z must be approximated, as is usually the case, then x must differ from y@z by no more than half the difference between the two representable numbers that straddle y@z. That difference depends on two factors:

   1. The precision to which the calculation is carried out, as determined either by the precision control bits or by the format used in memory. On the 80387, the precisions are single (24 significant bits), double (53 significant bits), and extended (64 significant bits).

   2. How close y@z is to zero. In this respect the presence of denormal numbers on the 80387 provides a distinct advantage over systems that do not admit denormal numbers.

In any floating-point number system, the density of representable numbers is greater near zero than near the largest representable magnitudes. However, machines that do not use denormal numbers suffer from an enormous gap between zero and its closest neighbors. Figures 3-1 and 3-2 show what happens near zero in two kinds of floating-point number systems.

Figure 3-1 shows a floating-point number system that (like the 80387) admits denormal numbers. For simplicity, only the non-negative numbers appear and the figure illustrates a number system that carries just four significant bits instead of the 24, 53, or 64 significant bits that the 80387 offers. Each vertical mark stands for a number representable in four significant bits, and the bolder marks stand for the normal powers of 2. The denormal numbers lie between 0 and the nearest normal power of 2. They are no less dense than the remaining normal nonzero numbers.

Figure 3-2 shows a floating-point number system that (unlike the 80387) does not admit denormal numbers. There are two yawning gaps, one on the positive side of zero (as illustrated) and one on the negative side of zero (not illustrated). The gap between zero and the nearest neighbor of zero differs from the gap between that neighbor and the next bigger number by a factor of about 8.4 * 10^(6) for single, 4.5 * 10^(15) for double, and 9.2 * 10^(18) for extended format. Those gaps would horribly complicate error analysis.
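The gap factors quoted above follow from the format parameters of Table 2-3. In a system without denormals, the nearest neighbor of zero is the smallest normal number, 2^(Emin), while the spacing between that number and the next larger one is one unit in the last place at that exponent, 2^(Emin-(p-1)). The ratio of the two is 2^(p-1). The Python sketch below carries out this arithmetic with exact rationals; the names FORMATS and gap_ratio are illustrative assumptions.

```python
from fractions import Fraction

# (p, Emin) for each real format, from Table 2-3.
FORMATS = {
    "single": (24, -126),
    "double": (53, -1022),
    "extended": (64, -16382),
}


def gap_ratio(name):
    """Ratio of the gap below the smallest normal number to the gap above it."""
    p, e_min = FORMATS[name]
    smallest_normal = Fraction(2) ** e_min        # gap between zero and its neighbor
    spacing = Fraction(2) ** (e_min - (p - 1))    # gap to the next larger number
    return smallest_normal / spacing


for name in FORMATS:
    print(f"{name:9s} {float(gap_ratio(name)):.1e}")
# prints:
# single    8.4e+06
# double    4.5e+15
# extended  9.2e+18
```

The ratio depends only on the precision p, not on Emin, which is why the factor grows from the single format to the extended format.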
The advantage of denormal numbers is apparent when one considers what happens in either case when the underflow exception is masked and y@z falls into the space between zero and the smallest normal magnitude. The 80387 returns the nearest denormal number. This action might be called "gradual underflow." The effect is no different than the rounding that can occur when y@z falls in the normal range. On the other hand, the system that does not have denormal numbers returns zero as the result, an action that can be much more inaccurate than rounding. This action could be called "abrupt underflow."

Figure 3-1. Floating-Point System with Denormals

0++++++++++++++-+-+-+-+-+-+-+----+---+---+---+---+---+---+---------+...
 \_____/ - - - - - - - - Normal Numbers - - - - - -
Denormals

Figure 3-2. Floating-Point System without Denormals

0       +++++++-+-+-+-+-+-+-+----+---+---+---+---+---+---+---------+---...
        - - - - - - - - Normal Numbers - - - - - -

3.1.2 Zeros

The value zero in the real and decimal integer formats may be signed either positive or negative, although the sign of a binary integer zero is always positive. For computational purposes, the value of zero always behaves identically, regardless of sign, and typically the fact that a zero may be signed is transparent to the programmer. If necessary, the FXAM instruction may be used to determine a zero's sign.

If a zero is loaded or generated in a register, the register is tagged zero. Table 3-3 lists the results of instructions executed with zero operands and also shows how a zero may be created from nonzero operands.

Table 3-3. Zero Operands and Results

Key to symbols used in this table:
   X and Y   Nonzero operands.
   *         Sign of original zero operand.
   #         Sign of original X operand.
   -#        Complement of sign of original X operand.
   ⊕         Exclusive OR of the signs of the operands.

Operation        Operands                    Result
FLD, FBLD        +0                          +0
                 -0                          -0
FILD             +0                          +0
FST, FSTP        +0                          +0
                 -0                          -0
                 +X                          +0   When extreme underflow denormalizes the result to zero.
                 -X                          -0   When extreme underflow denormalizes the result to zero.
FBSTP            +0                          +0
                 -0                          -0
FIST, FISTP      +0                          +0
                 -0                          -0
                 +X                          +0   When 0 < X < 1 and rounding mode is not up.
                 -X                          -0   When 0 < X < 1 and rounding mode is not up.
Addition         +0 plus +0                  +0
                 -0 plus -0                  -0
                 +0 plus -0, -0 plus +0      ±0   Sign determined by rounding mode: + for nearest, up, or chop, - for down.
                 -X plus +X, +X plus -X      ±0   Sign determined by rounding mode: + for nearest, up, or chop, - for down.
                 ±0 plus ±X, ±X plus ±0      #X
Subtraction      +0 minus -0                 +0
                 -0 minus +0                 -0
                 +0 minus +0, -0 minus -0    ±0   Sign determined by rounding mode: + for nearest, up, or chop, - for down.
                 +X minus +X, -X minus -X    ±0   Sign determined by rounding mode: + for nearest, up, or chop, - for down.
                 ±0 minus ±X                 -#X
                 ±X minus ±0                 #X
Multiplication   +0 * +0, -0 * -0            +0
                 +0 * -0, -0 * +0            -0
                 +0 * +X, +X * +0            +0
                 +0 * -X, -X * +0            -0
                 -0 * +X, +X * -0            -0
                 -0 * -X, -X * -0            +0
                 +X * +Y, -X * -Y            +0   When extreme underflow denormalizes the result to zero.
                 +X * -Y, -X * +Y            -0   When extreme underflow denormalizes the result to zero.
Division         ±0 ÷ ±0                     Invalid Operation
                 ±X ÷ ±0                     ⊕∞ (Zero Divide)
                 +0 ÷ +X, -0 ÷ -X            +0
                 +0 ÷ -X, -0 ÷ +X            -0
                 -X ÷ -Y, +X ÷ +Y            +0   When extreme underflow denormalizes the result to zero.
                 -X ÷ +Y, +X ÷ -Y            -0   When extreme underflow denormalizes the result to zero.
FPREM, FPREM1    ±0 rem ±0                   Invalid Operation
                 ±X rem ±0                   Invalid Operation
                 +0 rem ±X                   +0
                 -0 rem ±X                   -0
FPREM            +X rem ±Y                   +0   Y exactly divides X
                 -X rem ±Y                   -0   Y exactly divides X
FPREM1           +X rem ±Y                   +0   Y exactly divides X
                 -X rem ±Y                   -0   Y exactly divides X
FSQRT            +0                          +0
                 -0                          -0
Compare          ±0 : +X                     ±0 < +X
                 ±0 : ±0                     ±0 = ±0
                 ±0 : -X                     ±0 > -X
FTST             ±0                          ±0 = 0
FXAM             +0                          C{3}=1; C{2}=C{1}=C{0}=0
                 -0                          C{3}=C{1}=1; C{2}=C{0}=0
FCHS             +0                          -0
                 -0                          +0
FABS             ±0                          +0
F2XM1            +0                          +0
                 -0                          -0
FRNDINT          +0                          +0
                 -0                          -0
FSCALE           ±0 scaled by -∞             *0
                 ±0 scaled by +∞             Invalid Operation
                 ±0 scaled by X              *0
FXTRACT          +0                          ST=+0, ST(1)=-∞, Zero divide
                 -0                          ST=-0, ST(1)=-∞, Zero divide
FPTAN            ±0                          *0
FSIN (or SIN result of FSINCOS)
                 ±0                          *0
FCOS (or COS result of FSINCOS)
                 ±0                          +1
FPATAN           ±0 ÷ +X                     *0
                 ±0 ÷ -X                     *π
                 ±X ÷ ±0                     #π/2
                 ±0 ÷ +0                     *0
                 ±0 ÷ -0                     *π
                 +∞ ÷ ±0                     +π/2
                 -∞ ÷ ±0                     -π/2
                 ±0 ÷ +∞                     *0
                 ±0 ÷ -∞                     *π
FYL2X            ±Y * log(±0)                Zero Divide
                 ±0 * log(±0)                Invalid Operation
FYL2XP1          +Y * log(±0+1)              *0
                 -Y * log(±0+1)              -0

3.1.3 Infinity

The real formats support signed representations of infinities. These values are encoded with a biased exponent of all ones and a significand of 1{}00..00; if the infinity is in a register, it is tagged special.

A programmer may code an infinity, or it may be created by the NPX as its masked response to an overflow or a zero divide exception. Note that depending on rounding mode, the masked response may create the largest valid value representable in the destination rather than infinity.

The signs of the infinities are observed, and comparisons are possible. Infinities are always interpreted in the affine sense; that is, -∞ < (any finite number) < +∞. Arithmetic on infinities is always exact and, therefore, signals no exceptions, except for the invalid operations specified in Table 3-4.

Table 3-4. Infinity Operands and Results

Key to symbols used in this table:
   X    Zero or nonzero positive operand.
   Y    Nonzero positive operand.
   *    Sign of original infinity operand.
   -*   Complement of sign of original infinity operand.
   $    Sign of original operand.
   #    Sign of the original Y operand.
   ⊕    Exclusive OR of signs of operands.

Operation        Operands                    Result
Addition         +∞ plus +∞                  +∞
                 -∞ plus -∞                  -∞
                 +∞ plus -∞                  Invalid Operation
                 -∞ plus +∞                  Invalid Operation
                 ±∞ plus ±X                  *∞
                 ±X plus ±∞                  *∞
Subtraction      +∞ minus -∞                 +∞
                 -∞ minus +∞                 -∞
                 +∞ minus +∞                 Invalid Operation
                 -∞ minus -∞                 Invalid Operation
                 ±∞ minus ±X                 *∞
                 ±X minus ±∞                 -*∞
Multiplication   ±∞ * ±∞                     ⊕∞
                 ±∞ * ±Y, ±Y * ±∞            ⊕∞
                 ±0 * ±∞, ±∞ * ±0            Invalid Operation
Division         ±∞ ÷ ±∞                     Invalid Operation
                 ±∞ ÷ ±X                     ⊕∞
                 ±X ÷ ±∞                     ⊕0
                 ±∞ ÷ ±0                     ⊕∞
FSQRT            -∞                          Invalid Operation
                 +∞                          +∞
FPREM, FPREM1    ±∞ rem ±∞                   Invalid Operation
                 ±∞ rem ±X                   Invalid Operation
                 ±X rem ±∞                   $X, Q = 0
FRNDINT          ±∞                          *∞
FSCALE           ±∞ scaled by -∞             Invalid Operation
                 ±∞ scaled by +∞             *∞
                 ±∞ scaled by ±X             *∞
                 ±0 scaled by -∞             ±0   Sign of original zero operand.
                 ±0 scaled by +∞             Invalid Operation
                 ±Y scaled by +∞             #∞
                 ±Y scaled by -∞             #0
FXTRACT          ±∞                          ST = *∞, ST(1) = +∞
Compare          +∞ : +∞                     +∞ = +∞
                 -∞ : -∞                     -∞ = -∞
                 +∞ : -∞                     +∞ > -∞
                 -∞ : +∞                     -∞ < +∞
                 +∞ : ±X                     +∞ > X
                 -∞ : ±X                     -∞ < X
                 ±X : +∞                     X < +∞
                 ±X : -∞                     X > -∞
FTST             +∞                          +∞ > 0
                 -∞                          -∞ < 0
FPATAN           ±∞ ÷ ±X                     *π/2
                 ±Y ÷ +∞                     #0
                 ±Y ÷ -∞                     #π
                 ±∞ ÷ +∞                     *π/4
                 ±∞ ÷ -∞                     *3π/4
                 ±∞ ÷ ±0                     *π/2
                 +0 ÷ +∞                     +0
                 +0 ÷ -∞                     +π
                 -0 ÷ +∞                     -0
                 -0 ÷ -∞                     -π
F2XM1            +∞                          +∞
                 -∞                          -1
FYL2X, FYL2XP1   ±∞ * log(1)                 Invalid Operation
                 ±∞ * log(Y>1)               *∞
                 ±∞ * log(0