<!DOCTYPE html>
<html lang="en"><head><meta charset="UTF-8"/><meta name="viewport" content="width=device-width, initial-scale=1.0"/><title>Integers and Floating-Point Numbers · The Julia Language</title><script>(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
})(window,document,'script','https://www.google-analytics.com/analytics.js','ga');

ga('create', 'UA-28835595-6', 'auto');
ga('send', 'pageview');
</script><link href="https://cdnjs.cloudflare.com/ajax/libs/normalize/4.2.0/normalize.min.css" rel="stylesheet" type="text/css"/><link href="https://fonts.googleapis.com/css?family=Lato|Roboto+Mono" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/4.6.3/css/font-awesome.min.css" rel="stylesheet" type="text/css"/><script>documenterBaseURL=".."</script><script src="https://cdnjs.cloudflare.com/ajax/libs/require.js/2.2.0/require.min.js" data-main="../assets/documenter.js"></script><script src="../siteinfo.js"></script><script src="../../versions.js"></script><link href="../assets/highlightjs/default.css" rel="stylesheet" type="text/css"/><link href="../assets/documenter.css" rel="stylesheet" type="text/css"/></head><body><nav class="toc"><a href="../index.html"><img class="logo" src="../assets/logo.png" alt="The Julia Language logo"/></a><h1>The Julia Language</h1><select id="version-selector" onChange="window.location.href=this.value" style="visibility: hidden"></select><form class="search" action="../search.html"><input id="search-query" name="q" type="text" placeholder="Search docs"/></form><ul><li><a class="toctext" href="../index.html">Home</a></li><li><span class="toctext">Manual</span><ul><li><a class="toctext" href="introduction.html">Introduction</a></li><li><a class="toctext" href="getting-started.html">Getting Started</a></li><li><a class="toctext" href="variables.html">Variables</a></li><li class="current"><a class="toctext" href="integers-and-floating-point-numbers.html">Integers and Floating-Point Numbers</a><ul class="internal"><li><a class="toctext" href="#Integers-1">Integers</a></li><li><a class="toctext" href="#Floating-Point-Numbers-1">Floating-Point Numbers</a></li><li><a class="toctext" href="#Arbitrary-Precision-Arithmetic-1">Arbitrary Precision Arithmetic</a></li><li><a class="toctext" href="#man-numeric-literal-coefficients-1">Numeric Literal Coefficients</a></li><li><a class="toctext" 
href="#Literal-zero-and-one-1">Literal zero and one</a></li></ul></li><li><a class="toctext" href="mathematical-operations.html">Mathematical Operations and Elementary Functions</a></li><li><a class="toctext" href="complex-and-rational-numbers.html">Complex and Rational Numbers</a></li><li><a class="toctext" href="strings.html">Strings</a></li><li><a class="toctext" href="functions.html">Functions</a></li><li><a class="toctext" href="control-flow.html">Control Flow</a></li><li><a class="toctext" href="variables-and-scoping.html">Scope of Variables</a></li><li><a class="toctext" href="types.html">Types</a></li><li><a class="toctext" href="methods.html">Methods</a></li><li><a class="toctext" href="constructors.html">Constructors</a></li><li><a class="toctext" href="conversion-and-promotion.html">Conversion and Promotion</a></li><li><a class="toctext" href="interfaces.html">Interfaces</a></li><li><a class="toctext" href="modules.html">Modules</a></li><li><a class="toctext" href="documentation.html">Documentation</a></li><li><a class="toctext" href="metaprogramming.html">Metaprogramming</a></li><li><a class="toctext" href="arrays.html">Multi-dimensional Arrays</a></li><li><a class="toctext" href="linear-algebra.html">Linear algebra</a></li><li><a class="toctext" href="networking-and-streams.html">Networking and Streams</a></li><li><a class="toctext" href="parallel-computing.html">Parallel Computing</a></li><li><a class="toctext" href="dates.html">Date and DateTime</a></li><li><a class="toctext" href="interacting-with-julia.html">Interacting With Julia</a></li><li><a class="toctext" href="running-external-programs.html">Running External Programs</a></li><li><a class="toctext" href="calling-c-and-fortran-code.html">Calling C and Fortran Code</a></li><li><a class="toctext" href="handling-operating-system-variation.html">Handling Operating System Variation</a></li><li><a class="toctext" href="environment-variables.html">Environment Variables</a></li><li><a class="toctext" 
href="embedding.html">Embedding Julia</a></li><li><a class="toctext" href="packages.html">Packages</a></li><li><a class="toctext" href="profile.html">Profiling</a></li><li><a class="toctext" href="stacktraces.html">Stack Traces</a></li><li><a class="toctext" href="performance-tips.html">Performance Tips</a></li><li><a class="toctext" href="workflow-tips.html">Workflow Tips</a></li><li><a class="toctext" href="style-guide.html">Style Guide</a></li><li><a class="toctext" href="faq.html">Frequently Asked Questions</a></li><li><a class="toctext" href="noteworthy-differences.html">Noteworthy Differences from other Languages</a></li><li><a class="toctext" href="unicode-input.html">Unicode Input</a></li></ul></li><li><span class="toctext">Standard Library</span><ul><li><a class="toctext" href="../stdlib/base.html">Essentials</a></li><li><a class="toctext" href="../stdlib/collections.html">Collections and Data Structures</a></li><li><a class="toctext" href="../stdlib/math.html">Mathematics</a></li><li><a class="toctext" href="../stdlib/numbers.html">Numbers</a></li><li><a class="toctext" href="../stdlib/strings.html">Strings</a></li><li><a class="toctext" href="../stdlib/arrays.html">Arrays</a></li><li><a class="toctext" href="../stdlib/parallel.html">Tasks and Parallel Computing</a></li><li><a class="toctext" href="../stdlib/linalg.html">Linear Algebra</a></li><li><a class="toctext" href="../stdlib/constants.html">Constants</a></li><li><a class="toctext" href="../stdlib/file.html">Filesystem</a></li><li><a class="toctext" href="../stdlib/io-network.html">I/O and Network</a></li><li><a class="toctext" href="../stdlib/punctuation.html">Punctuation</a></li><li><a class="toctext" href="../stdlib/sort.html">Sorting and Related Functions</a></li><li><a class="toctext" href="../stdlib/pkg.html">Package Manager Functions</a></li><li><a class="toctext" href="../stdlib/dates.html">Dates and Time</a></li><li><a class="toctext" href="../stdlib/iterators.html">Iteration 
utilities</a></li><li><a class="toctext" href="../stdlib/test.html">Unit Testing</a></li><li><a class="toctext" href="../stdlib/c.html">C Interface</a></li><li><a class="toctext" href="../stdlib/libc.html">C Standard Library</a></li><li><a class="toctext" href="../stdlib/libdl.html">Dynamic Linker</a></li><li><a class="toctext" href="../stdlib/profile.html">Profiling</a></li><li><a class="toctext" href="../stdlib/stacktraces.html">StackTraces</a></li><li><a class="toctext" href="../stdlib/simd-types.html">SIMD Support</a></li></ul></li><li><span class="toctext">Developer Documentation</span><ul><li><a class="toctext" href="../devdocs/reflection.html">Reflection and introspection</a></li><li><span class="toctext">Documentation of Julia's Internals</span><ul><li><a class="toctext" href="../devdocs/init.html">Initialization of the Julia runtime</a></li><li><a class="toctext" href="../devdocs/ast.html">Julia ASTs</a></li><li><a class="toctext" href="../devdocs/types.html">More about types</a></li><li><a class="toctext" href="../devdocs/object.html">Memory layout of Julia Objects</a></li><li><a class="toctext" href="../devdocs/eval.html">Eval of Julia code</a></li><li><a class="toctext" href="../devdocs/callconv.html">Calling Conventions</a></li><li><a class="toctext" href="../devdocs/compiler.html">High-level Overview of the Native-Code Generation Process</a></li><li><a class="toctext" href="../devdocs/functions.html">Julia Functions</a></li><li><a class="toctext" href="../devdocs/cartesian.html">Base.Cartesian</a></li><li><a class="toctext" href="../devdocs/meta.html">Talking to the compiler (the <code>:meta</code> mechanism)</a></li><li><a class="toctext" href="../devdocs/subarrays.html">SubArrays</a></li><li><a class="toctext" href="../devdocs/sysimg.html">System Image Building</a></li><li><a class="toctext" href="../devdocs/llvm.html">Working with LLVM</a></li><li><a class="toctext" href="../devdocs/stdio.html">printf() and stdio in the Julia runtime</a></li><li><a 
class="toctext" href="../devdocs/boundscheck.html">Bounds checking</a></li><li><a class="toctext" href="../devdocs/locks.html">Proper maintenance and care of multi-threading locks</a></li><li><a class="toctext" href="../devdocs/offset-arrays.html">Arrays with custom indices</a></li><li><a class="toctext" href="../devdocs/libgit2.html">Base.LibGit2</a></li><li><a class="toctext" href="../devdocs/require.html">Module loading</a></li></ul></li><li><span class="toctext">Developing/debugging Julia's C code</span><ul><li><a class="toctext" href="../devdocs/backtraces.html">Reporting and analyzing crashes (segfaults)</a></li><li><a class="toctext" href="../devdocs/debuggingtips.html">gdb debugging tips</a></li><li><a class="toctext" href="../devdocs/valgrind.html">Using Valgrind with Julia</a></li><li><a class="toctext" href="../devdocs/sanitizers.html">Sanitizer support</a></li></ul></li></ul></li></ul></nav><article id="docs"><header><nav><ul><li>Manual</li><li><a href="integers-and-floating-point-numbers.html">Integers and Floating-Point Numbers</a></li></ul><a class="edit-page" href="https://github.com/JuliaLang/julia/tree/d386e40c17d43b79fc89d3e579fc04547241787c/doc/src/manual/integers-and-floating-point-numbers.md"><span class="fa"></span> Edit on GitHub</a></nav><hr/><div id="topbar"><span>Integers and Floating-Point Numbers</span><a class="fa fa-bars" href="#"></a></div></header><h1><a class="nav-anchor" id="Integers-and-Floating-Point-Numbers-1" href="#Integers-and-Floating-Point-Numbers-1">Integers and Floating-Point Numbers</a></h1><p>Integers and floating-point values are the basic building blocks of arithmetic and computation. Built-in representations of such values are called numeric primitives, while representations of integers and floating-point numbers as immediate values in code are known as numeric literals. 
For example, <code>1</code> is an integer literal, while <code>1.0</code> is a floating-point literal; their binary in-memory representations as objects are numeric primitives.</p><p>Julia provides a broad range of primitive numeric types, and a full complement of arithmetic and bitwise operators as well as standard mathematical functions are defined over them. These map directly onto numeric types and operations that are natively supported on modern computers, thus allowing Julia to take full advantage of computational resources. Additionally, Julia provides software support for <a href="integers-and-floating-point-numbers.html#Arbitrary-Precision-Arithmetic-1">Arbitrary Precision Arithmetic</a>, which can handle operations on numeric values that cannot be represented effectively in native hardware representations, but at the cost of relatively slower performance.</p><p>The following are Julia's primitive numeric types:</p><ul><li><p><strong>Integer types:</strong></p></li></ul><table><tr><th>Type</th><th>Signed?</th><th>Number of bits</th><th>Smallest value</th><th>Largest value</th></tr><tr><td><a href="../stdlib/numbers.html#Core.Int8"><code>Int8</code></a></td><td>✓</td><td>8</td><td>-2^7</td><td>2^7 - 1</td></tr><tr><td><a href="../stdlib/numbers.html#Core.UInt8"><code>UInt8</code></a></td><td> </td><td>8</td><td>0</td><td>2^8 - 1</td></tr><tr><td><a href="../stdlib/numbers.html#Core.Int16"><code>Int16</code></a></td><td>✓</td><td>16</td><td>-2^15</td><td>2^15 - 1</td></tr><tr><td><a href="../stdlib/numbers.html#Core.UInt16"><code>UInt16</code></a></td><td> </td><td>16</td><td>0</td><td>2^16 - 1</td></tr><tr><td><a href="../stdlib/numbers.html#Core.Int32"><code>Int32</code></a></td><td>✓</td><td>32</td><td>-2^31</td><td>2^31 - 1</td></tr><tr><td><a href="../stdlib/numbers.html#Core.UInt32"><code>UInt32</code></a></td><td> </td><td>32</td><td>0</td><td>2^32 - 1</td></tr><tr><td><a 
href="../stdlib/numbers.html#Core.Int64"><code>Int64</code></a></td><td>✓</td><td>64</td><td>-2^63</td><td>2^63 - 1</td></tr><tr><td><a href="../stdlib/numbers.html#Core.UInt64"><code>UInt64</code></a></td><td> </td><td>64</td><td>0</td><td>2^64 - 1</td></tr><tr><td><a href="../stdlib/numbers.html#Core.Int128"><code>Int128</code></a></td><td>✓</td><td>128</td><td>-2^127</td><td>2^127 - 1</td></tr><tr><td><a href="../stdlib/numbers.html#Core.UInt128"><code>UInt128</code></a></td><td> </td><td>128</td><td>0</td><td>2^128 - 1</td></tr><tr><td><a href="../stdlib/numbers.html#Core.Bool"><code>Bool</code></a></td><td>N/A</td><td>8</td><td><code>false</code> (0)</td><td><code>true</code> (1)</td></tr></table><ul><li><p><strong>Floating-point types:</strong></p></li></ul><table><tr><th>Type</th><th>Precision</th><th>Number of bits</th></tr><tr><td><a href="../stdlib/numbers.html#Core.Float16"><code>Float16</code></a></td><td><a href="https://en.wikipedia.org/wiki/Half-precision_floating-point_format">half</a></td><td>16</td></tr><tr><td><a href="../stdlib/numbers.html#Core.Float32"><code>Float32</code></a></td><td><a href="https://en.wikipedia.org/wiki/Single_precision_floating-point_format">single</a></td><td>32</td></tr><tr><td><a href="../stdlib/numbers.html#Core.Float64"><code>Float64</code></a></td><td><a href="https://en.wikipedia.org/wiki/Double_precision_floating-point_format">double</a></td><td>64</td></tr></table><p>Additionally, full support for <a href="complex-and-rational-numbers.html#Complex-and-Rational-Numbers-1">Complex and Rational Numbers</a> is built on top of these primitive numeric types. 
All numeric types interoperate naturally without explicit casting, thanks to a flexible, user-extensible <a href="conversion-and-promotion.html#conversion-and-promotion-1">type promotion system</a>.</p><h2><a class="nav-anchor" id="Integers-1" href="#Integers-1">Integers</a></h2><p>Literal integers are represented in the standard manner:</p><pre><code class="language-julia-repl">julia> 1
1

julia> 1234
1234</code></pre><p>The default type for an integer literal depends on whether the target system has a 32-bit architecture or a 64-bit architecture:</p><pre><code class="language-julia-repl"># 32-bit system:
julia> typeof(1)
Int32

# 64-bit system:
julia> typeof(1)
Int64</code></pre><p>The Julia internal variable <a href="../stdlib/constants.html#Base.Sys.WORD_SIZE"><code>Sys.WORD_SIZE</code></a> indicates whether the target system is 32-bit or 64-bit:</p><pre><code class="language-julia-repl"># 32-bit system:
julia> Sys.WORD_SIZE
32

# 64-bit system:
julia> Sys.WORD_SIZE
64</code></pre><p>Julia also defines the types <code>Int</code> and <code>UInt</code>, which are aliases for the system's signed and unsigned native integer types respectively:</p><pre><code class="language-julia-repl"># 32-bit system:
julia> Int
Int32
julia> UInt
UInt32

# 64-bit system:
julia> Int
Int64
julia> UInt
UInt64</code></pre><p>Larger integer literals that cannot be represented using only 32 bits but can be represented in 64 bits always create 64-bit integers, regardless of the system type:</p><pre><code class="language-julia-repl"># 32-bit or 64-bit system:
julia> typeof(3000000000)
Int64</code></pre><p>Unsigned integers are input and output using the <code>0x</code> prefix and hexadecimal (base 16) digits <code>0-9a-f</code> (the capitalized digits <code>A-F</code> also work for input). The size of the unsigned value is determined by the number of hex digits used:</p><pre><code class="language-julia-repl">julia> 0x1
0x01

julia> typeof(ans)
UInt8

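# Hedged aside, not part of the original transcript: uppercase hex digits are
# accepted on input, and the value is echoed back in lowercase.
julia> 0xAB
0xab
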
julia> 0x123
0x0123

julia> typeof(ans)
UInt16

julia> 0x1234567
0x01234567

julia> typeof(ans)
UInt32

julia> 0x123456789abcdef
0x0123456789abcdef

julia> typeof(ans)
UInt64</code></pre><p>This behavior is based on the observation that when one uses unsigned hex literals for integer values, one typically is using them to represent a fixed numeric byte sequence, rather than just an integer value.</p><p>Recall that the variable <a href="../stdlib/base.html#ans"><code>ans</code></a> is set to the value of the last expression evaluated in an interactive session. This does not occur when Julia code is run in other ways.</p><p>Binary and octal literals are also supported:</p><pre><code class="language-julia-repl">julia> 0b10
0x02

julia> typeof(ans)
UInt8

julia> 0o10
0x08

julia> typeof(ans)
UInt8</code></pre><p>The minimum and maximum representable values of primitive numeric types such as integers are given by the <a href="../stdlib/base.html#Base.typemin"><code>typemin()</code></a> and <a href="../stdlib/base.html#Base.typemax"><code>typemax()</code></a> functions:</p><pre><code class="language-julia-repl">julia> (typemin(Int32), typemax(Int32))
(-2147483648, 2147483647)

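# Hedged aside, not part of the original transcript: the result of typemax
# has the same type as the type passed in, not the default Int.
julia> typeof(typemax(Int8))
Int8
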
julia> for T in [Int8,Int16,Int32,Int64,Int128,UInt8,UInt16,UInt32,UInt64,UInt128]
           println("$(lpad(T,7)): [$(typemin(T)),$(typemax(T))]")
       end
   Int8: [-128,127]
  Int16: [-32768,32767]
  Int32: [-2147483648,2147483647]
  Int64: [-9223372036854775808,9223372036854775807]
 Int128: [-170141183460469231731687303715884105728,170141183460469231731687303715884105727]
  UInt8: [0,255]
 UInt16: [0,65535]
 UInt32: [0,4294967295]
 UInt64: [0,18446744073709551615]
UInt128: [0,340282366920938463463374607431768211455]</code></pre><p>The values returned by <a href="../stdlib/base.html#Base.typemin"><code>typemin()</code></a> and <a href="../stdlib/base.html#Base.typemax"><code>typemax()</code></a> are always of the given argument type. (The above expression uses several features we have yet to introduce, including <a href="control-flow.html#man-loops-1">for loops</a>, <a href="strings.html#man-strings-1">Strings</a>, and <a href="metaprogramming.html#Interpolation-1">Interpolation</a>, but should be easy enough to understand for users with some existing programming experience.)</p><h3><a class="nav-anchor" id="Overflow-behavior-1" href="#Overflow-behavior-1">Overflow behavior</a></h3><p>In Julia, exceeding the maximum representable value of a given type results in a wraparound behavior:</p><pre><code class="language-julia-repl">julia> x = typemax(Int64)
9223372036854775807

julia> x + 1
-9223372036854775808

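# Hedged aside, not part of the original transcript: converting to a wider
# type before adding is one way to avoid the wraparound shown above.
julia> Int128(x) + 1
9223372036854775808
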
julia> x + 1 == typemin(Int64)
true</code></pre><p>Thus, arithmetic with Julia integers is actually a form of <a href="https://en.wikipedia.org/wiki/Modular_arithmetic">modular arithmetic</a>. This reflects the characteristics of the underlying arithmetic of integers as implemented on modern computers. In applications where overflow is possible, explicit checking for wraparound produced by overflow is essential; otherwise, the <a href="../stdlib/numbers.html#Base.GMP.BigInt"><code>BigInt</code></a> type in <a href="integers-and-floating-point-numbers.html#Arbitrary-Precision-Arithmetic-1">Arbitrary Precision Arithmetic</a> is recommended instead.</p><h3><a class="nav-anchor" id="Division-errors-1" href="#Division-errors-1">Division errors</a></h3><p>Integer division (the <code>div</code> function) has two exceptional cases: dividing by zero, and dividing the lowest negative number (<a href="../stdlib/base.html#Base.typemin"><code>typemin()</code></a>) by -1. Both of these cases throw a <a href="../stdlib/base.html#Core.DivideError"><code>DivideError</code></a>. The remainder and modulus functions (<code>rem</code> and <code>mod</code>) throw a <a href="../stdlib/base.html#Core.DivideError"><code>DivideError</code></a> when their second argument is zero.</p><h2><a class="nav-anchor" id="Floating-Point-Numbers-1" href="#Floating-Point-Numbers-1">Floating-Point Numbers</a></h2><p>Literal floating-point numbers are represented in the standard formats:</p><pre><code class="language-julia-repl">julia> 1.0
1.0

julia> 1.
1.0

julia> 0.5
0.5

julia> .5
0.5

julia> -1.23
-1.23

julia> 1e10
1.0e10

julia> 2.5e-4
0.00025</code></pre><p>The above results are all <a href="../stdlib/numbers.html#Core.Float64"><code>Float64</code></a> values. Literal <a href="../stdlib/numbers.html#Core.Float32"><code>Float32</code></a> values can be entered by writing an <code>f</code> in place of <code>e</code>:</p><pre><code class="language-julia-repl">julia> 0.5f0
0.5f0

julia> typeof(ans)
Float32

julia> 2.5f-4
0.00025f0</code></pre><p>Values can be converted to <a href="../stdlib/numbers.html#Core.Float32"><code>Float32</code></a> easily:</p><pre><code class="language-julia-repl">julia> Float32(-1.5)
-1.5f0

julia> typeof(ans)
Float32</code></pre><p>Hexadecimal floating-point literals are also valid, but only as <a href="../stdlib/numbers.html#Core.Float64"><code>Float64</code></a> values:</p><pre><code class="language-julia-repl">julia> 0x1p0
1.0

julia> 0x1.8p3
12.0

julia> 0x.4p-1
0.125

julia> typeof(ans)
Float64</code></pre><p>Half-precision floating-point numbers are also supported (<a href="../stdlib/numbers.html#Core.Float16"><code>Float16</code></a>), but they are implemented in software and use <a href="../stdlib/numbers.html#Core.Float32"><code>Float32</code></a> for calculations.</p><pre><code class="language-julia-repl">julia> sizeof(Float16(4.))
2

julia> 2*Float16(4.)
Float16(8.0)</code></pre><p>The underscore <code>_</code> can be used as digit separator:</p><pre><code class="language-julia-repl">julia> 10_000, 0.000_000_005, 0xdead_beef, 0b1011_0010
(10000, 5.0e-9, 0xdeadbeef, 0xb2)</code></pre><h3><a class="nav-anchor" id="Floating-point-zero-1" href="#Floating-point-zero-1">Floating-point zero</a></h3><p>Floating-point numbers have <a href="https://en.wikipedia.org/wiki/Signed_zero">two zeros</a>, positive zero and negative zero. They are equal to each other but have different binary representations, as can be seen using the <code>bits</code> function:</p><pre><code class="language-julia-repl">julia> 0.0 == -0.0
true

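# Hedged aside, not part of the original transcript: unlike ==, isequal
# treats positive and negative zero as different values.
julia> isequal(0.0, -0.0)
false
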
julia> bits(0.0)
"0000000000000000000000000000000000000000000000000000000000000000"

julia> bits(-0.0)
"1000000000000000000000000000000000000000000000000000000000000000"</code></pre><h3><a class="nav-anchor" id="Special-floating-point-values-1" href="#Special-floating-point-values-1">Special floating-point values</a></h3><p>There are three specified standard floating-point values that do not correspond to any point on the real number line:</p><table><tr><th><code>Float16</code></th><th><code>Float32</code></th><th><code>Float64</code></th><th>Name</th><th>Description</th></tr><tr><td><code>Inf16</code></td><td><code>Inf32</code></td><td><code>Inf</code></td><td>positive infinity</td><td>a value greater than all finite floating-point values</td></tr><tr><td><code>-Inf16</code></td><td><code>-Inf32</code></td><td><code>-Inf</code></td><td>negative infinity</td><td>a value less than all finite floating-point values</td></tr><tr><td><code>NaN16</code></td><td><code>NaN32</code></td><td><code>NaN</code></td><td>not a number</td><td>a value not <code>==</code> to any floating-point value (including itself)</td></tr></table><p>For further discussion of how these non-finite floating-point values are ordered with respect to each other and other floats, see <a href="mathematical-operations.html#Numeric-Comparisons-1">Numeric Comparisons</a>. By the <a href="https://en.wikipedia.org/wiki/IEEE_754-2008">IEEE 754 standard</a>, these floating-point values are the results of certain arithmetic operations:</p><pre><code class="language-julia-repl">julia> 1/Inf
0.0

julia> 1/0
Inf

julia> -5/0
-Inf

julia> 0.000001/0
Inf

julia> 0/0
NaN

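# Hedged aside, not part of the original transcript: NaN is not == to
# itself, so isnan is the reliable way to test for it.
julia> NaN == NaN
false

julia> isnan(0/0)
true
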
julia> 500 + Inf
Inf

julia> 500 - Inf
-Inf

julia> Inf + Inf
Inf

julia> Inf - Inf
NaN

julia> Inf * Inf
Inf

julia> Inf / Inf
NaN

julia> 0 * Inf
NaN</code></pre><p>The <a href="../stdlib/base.html#Base.typemin"><code>typemin()</code></a> and <a href="../stdlib/base.html#Base.typemax"><code>typemax()</code></a> functions also apply to floating-point types:</p><pre><code class="language-julia-repl">julia> (typemin(Float16),typemax(Float16))
(-Inf16, Inf16)

julia> (typemin(Float32),typemax(Float32))
(-Inf32, Inf32)

julia> (typemin(Float64),typemax(Float64))
(-Inf, Inf)</code></pre><h3><a class="nav-anchor" id="Machine-epsilon-1" href="#Machine-epsilon-1">Machine epsilon</a></h3><p>Most real numbers cannot be represented exactly with floating-point numbers, and so for many purposes it is important to know the distance between two adjacent representable floating-point numbers, which is often known as <a href="https://en.wikipedia.org/wiki/Machine_epsilon">machine epsilon</a>.</p><p>Julia provides <a href="../stdlib/dates.html#Base.eps"><code>eps()</code></a>, which gives the distance between <code>1.0</code> and the next larger representable floating-point value:</p><pre><code class="language-julia-repl">julia> eps(Float32)
1.1920929f-7

julia> eps(Float64)
2.220446049250313e-16

julia> eps() # same as eps(Float64)
2.220446049250313e-16</code></pre><p>These values are <code>2.0^-23</code> and <code>2.0^-52</code> as <a href="../stdlib/numbers.html#Core.Float32"><code>Float32</code></a> and <a href="../stdlib/numbers.html#Core.Float64"><code>Float64</code></a> values, respectively. The <a href="../stdlib/dates.html#Base.eps"><code>eps()</code></a> function can also take a floating-point value as an argument, and gives the absolute difference between that value and the next representable floating point value. That is, <code>eps(x)</code> yields a value of the same type as <code>x</code> such that <code>x + eps(x)</code> is the next representable floating-point value larger than <code>x</code>:</p><pre><code class="language-julia-repl">julia> eps(1.0)
2.220446049250313e-16

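# Hedged aside, not part of the original transcript: adding eps(x) to x
# lands exactly on the next representable value.
julia> 1.0 + eps(1.0) == nextfloat(1.0)
true
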
julia> eps(1000.)
1.1368683772161603e-13

julia> eps(1e-27)
1.793662034335766e-43

julia> eps(0.0)
5.0e-324</code></pre><p>The distance between two adjacent representable floating-point numbers is not constant, but is smaller for smaller values and larger for larger values. In other words, the representable floating-point numbers are densest on the real number line near zero, and grow sparser exponentially as one moves farther away from zero. By definition, <code>eps(1.0)</code> is the same as <code>eps(Float64)</code> since <code>1.0</code> is a 64-bit floating-point value.</p><p>Julia also provides the <a href="../stdlib/numbers.html#Base.nextfloat"><code>nextfloat()</code></a> and <a href="../stdlib/numbers.html#Base.prevfloat"><code>prevfloat()</code></a> functions, which return the next larger and the next smaller representable floating-point number relative to the argument, respectively:</p><pre><code class="language-julia-repl">julia> x = 1.25f0
1.25f0

julia> nextfloat(x)
1.2500001f0

julia> prevfloat(x)
1.2499999f0

julia> bits(prevfloat(x))
"00111111100111111111111111111111"

julia> bits(x)
"00111111101000000000000000000000"

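# Hedged aside, not part of the original transcript: reinterpreted as
# unsigned integers, x and nextfloat(x) differ by exactly one.
julia> reinterpret(UInt32, nextfloat(x)) - reinterpret(UInt32, x)
0x00000001
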
julia> bits(nextfloat(x))
"00111111101000000000000000000001"</code></pre><p>This example highlights the general principle that the adjacent representable floating-point numbers also have adjacent binary integer representations.</p><h3><a class="nav-anchor" id="Rounding-modes-1" href="#Rounding-modes-1">Rounding modes</a></h3><p>If a number doesn't have an exact floating-point representation, it must be rounded to an appropriate representable value. However, the manner in which this rounding is done can be changed, if required, according to the rounding modes presented in the <a href="https://en.wikipedia.org/wiki/IEEE_754-2008">IEEE 754 standard</a>.</p><pre><code class="language-julia-repl">julia> x = 1.1; y = 0.1;

julia> x + y
1.2000000000000002

julia> setrounding(Float64,RoundDown) do
           x + y
       end
1.2</code></pre><p>The default mode used is always <a href="../stdlib/math.html#Base.Rounding.RoundNearest"><code>RoundNearest</code></a>, which rounds to the nearest representable value, with ties rounded towards the nearest value with an even least significant bit.</p><div class="admonition warning"><div class="admonition-title">Warning</div><div class="admonition-text"><p>Rounding is generally only correct for basic arithmetic functions (<a href="../stdlib/math.html#Base.:+"><code>+()</code></a>, <a href="../stdlib/math.html#Base.:--Tuple{Any}"><code>-()</code></a>, <a href="../stdlib/strings.html#Base.:*-Tuple{AbstractString,Vararg{Any,N} where N}"><code>*()</code></a>, <a href="../stdlib/math.html#Base.:/"><code>/()</code></a> and <a href="../stdlib/math.html#Base.sqrt"><code>sqrt()</code></a>) and type conversion operations. Many other functions assume the default <a href="../stdlib/math.html#Base.Rounding.RoundNearest"><code>RoundNearest</code></a> mode is set, and can give erroneous results when operating under other rounding modes.</p></div></div><h3><a class="nav-anchor" id="Background-and-References-1" href="#Background-and-References-1">Background and References</a></h3><p>Floating-point arithmetic entails many subtleties which can be surprising to users who are unfamiliar with the low-level implementation details. However, these subtleties are described in detail in most books on scientific computation, and also in the following references:</p><ul><li><p>The definitive guide to floating point arithmetic is the <a href="http://standards.ieee.org/findstds/standard/754-2008.html">IEEE 754-2008 Standard</a>; however, it is not available for free online.</p></li><li><p>For a brief but lucid presentation of how floating-point numbers are represented, see John D. 
Cook's <a href="https://www.johndcook.com/blog/2009/04/06/anatomy-of-a-floating-point-number/">article</a> on the subject as well as his <a href="https://www.johndcook.com/blog/2009/04/06/numbers-are-a-leaky-abstraction/">introduction</a> to some of the issues arising from how this representation differs in behavior from the idealized abstraction of real numbers.</p></li><li><p>Also recommended is Bruce Dawson's <a href="https://randomascii.wordpress.com/2012/05/20/thats-not-normalthe-performance-of-odd-floats/">series of blog posts on floating-point numbers</a>.</p></li><li><p>For an excellent, in-depth discussion of floating-point numbers and issues of numerical accuracy encountered when computing with them, see David Goldberg's paper <a href="http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.22.6768&rep=rep1&type=pdf">What Every Computer Scientist Should Know About Floating-Point Arithmetic</a>.</p></li><li><p>For even more extensive documentation of the history of, rationale for, and issues with floating-point numbers, as well as discussion of many other topics in numerical computing, see the <a href="https://people.eecs.berkeley.edu/~wkahan/">collected writings</a> of <a href="https://en.wikipedia.org/wiki/William_Kahan">William Kahan</a>, commonly known as the "Father of Floating-Point". Of particular interest may be <a href="https://people.eecs.berkeley.edu/~wkahan/ieee754status/754story.html">An Interview with the Old Man of Floating-Point</a>.</p></li></ul><h2><a class="nav-anchor" id="Arbitrary-Precision-Arithmetic-1" href="#Arbitrary-Precision-Arithmetic-1">Arbitrary Precision Arithmetic</a></h2><p>To allow computations with arbitrary-precision integers and floating point numbers, Julia wraps the <a href="https://gmplib.org">GNU Multiple Precision Arithmetic Library (GMP)</a> and the <a href="http://www.mpfr.org">GNU MPFR Library</a>, respectively. 
The <a href="../stdlib/numbers.html#Base.GMP.BigInt"><code>BigInt</code></a> and <a href="../stdlib/numbers.html#Base.MPFR.BigFloat"><code>BigFloat</code></a> types are available in Julia for arbitrary precision integer and floating point numbers respectively.</p><p>Constructors exist to create these types from primitive numerical types, and <a href="../stdlib/numbers.html#Base.parse-Tuple{Type,Any,Any}"><code>parse()</code></a> can be used to construct them from <code>AbstractString</code>s. Once created, they participate in arithmetic with all other numeric types thanks to Julia's <a href="conversion-and-promotion.html#conversion-and-promotion-1">type promotion and conversion mechanism</a>:</p><pre><code class="language-julia-repl">julia> BigInt(typemax(Int64)) + 1
9223372036854775808

julia> parse(BigInt, "123456789012345678901234567890") + 1
123456789012345678901234567891
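
# Illustrative sketch, not part of the original example: the big"..." string
# macro is assumed here as a shorthand for parsing an arbitrary-precision literal.
julia> big"123456789012345678901234567890" + 1
123456789012345678901234567891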

julia> parse(BigFloat, "1.23456789012345678901")
1.234567890123456789010000000000000000000000000000000000000000000000000000000004

julia> BigFloat(2.0^66) / 3
2.459565876494606882133333333333333333333333333333333333333333333333333333333344e+19

julia> factorial(BigInt(40))
815915283247897734345611269596115894272000000000</code></pre><p>However, type promotion between the primitive types above and <a href="../stdlib/numbers.html#Base.GMP.BigInt"><code>BigInt</code></a>/<a href="../stdlib/numbers.html#Base.MPFR.BigFloat"><code>BigFloat</code></a> is not automatic and must be explicitly stated.</p><pre><code class="language-julia-repl">julia> x = typemin(Int64)
-9223372036854775808

julia> x = x - 1
9223372036854775807

julia> typeof(x)
Int64

julia> y = BigInt(typemin(Int64))
-9223372036854775808

julia> y = y - 1
-9223372036854775809
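
# Illustrative sketch: once y is a BigInt, it can be combined and compared with
# Int64 values directly; this check does not modify y.
julia> y + 1 == typemin(Int64)
true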

julia> typeof(y)
BigInt</code></pre><p>The default precision (in number of bits of the significand) and rounding mode of <a href="../stdlib/numbers.html#Base.MPFR.BigFloat"><code>BigFloat</code></a> operations can be changed globally by calling <a href="../stdlib/numbers.html#Base.MPFR.setprecision"><code>setprecision()</code></a> and <a href="../stdlib/numbers.html#Base.Rounding.setrounding-Tuple{Type,Any}"><code>setrounding()</code></a>, and all further calculations will take these changes into account. Alternatively, the precision or the rounding can be changed only within the execution of a particular block of code by using the same functions with a <code>do</code> block:</p><pre><code class="language-julia-repl">julia> setrounding(BigFloat, RoundUp) do
           BigFloat(1) + parse(BigFloat, "0.1")
       end
1.100000000000000000000000000000000000000000000000000000000000000000000000000003

julia> setrounding(BigFloat, RoundDown) do
           BigFloat(1) + parse(BigFloat, "0.1")
       end
1.099999999999999999999999999999999999999999999999999999999999999999999999999986
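
# Illustrative sketch: precision(BigFloat) reports the current default significand
# width in bits. The value 256 assumes the default has not been changed earlier
# in the session.
julia> precision(BigFloat)
256

julia> setprecision(40) do
           precision(BigFloat)
       end
40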

julia> setprecision(40) do
           BigFloat(1) + parse(BigFloat, "0.1")
       end
1.1000000000004</code></pre><h2><a class="nav-anchor" id="man-numeric-literal-coefficients-1" href="#man-numeric-literal-coefficients-1">Numeric Literal Coefficients</a></h2><p>To make common numeric formulas and expressions clearer, Julia allows variables to be immediately preceded by a numeric literal, implying multiplication. This makes writing polynomial expressions much cleaner:</p><pre><code class="language-jldoctest">julia> x = 3
3

julia> 2x^2 - 3x + 1
10
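
# Illustrative sketch: each literal coefficient stands for an explicit
# multiplication, so the two spellings compare equal.
julia> 2x^2 - 3x + 1 == 2*x^2 - 3*x + 1
true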

julia> 1.5x^2 - .5x + 1
13.0</code></pre><p>It also makes writing exponential functions more elegant:</p><pre><code class="language-jldoctest">julia> 2^2x
64</code></pre><p>The precedence of numeric literal coefficients is the same as that of unary operators such as negation. So <code>2^3x</code> is parsed as <code>2^(3x)</code>, and <code>2x^3</code> is parsed as <code>2*(x^3)</code>.</p><p>Numeric literals also work as coefficients to parenthesized expressions:</p><pre><code class="language-jldoctest">julia> 2(x-1)^2 - 3(x-1) + 1
3</code></pre><div class="admonition note"><div class="admonition-title">Note</div><div class="admonition-text"><p>The precedence of numeric literal coefficients used for implicit multiplication is higher than other binary operators such as multiplication (<code>*</code>), and division (<code>/</code>, <code>\</code>, and <code>//</code>). This means, for example, that <code>1 / 2im</code> equals <code>-0.5im</code> and <code>6 // 2(2 + 1)</code> equals <code>1 // 1</code>.</p></div></div><p>Additionally, parenthesized expressions can be used as coefficients to variables, implying multiplication of the expression by the variable:</p><pre><code class="language-jldoctest">julia> (x-1)x
6</code></pre><p>Neither juxtaposition of two parenthesized expressions, nor placing a variable before a parenthesized expression, however, can be used to imply multiplication:</p><pre><code class="language-jldoctest">julia> (x-1)(x+1)
ERROR: MethodError: objects of type Int64 are not callable
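
# Illustrative sketch: writing the operator explicitly gives the intended
# product (x is still 3 from the earlier example).
julia> (x-1)*(x+1)
8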

julia> x(x+1)
ERROR: MethodError: objects of type Int64 are not callable</code></pre><p>Both expressions are interpreted as function application: any expression that is not a numeric literal, when immediately followed by a parenthetical, is interpreted as a function applied to the values in parentheses (see <a href="faq.html#Functions-1">Functions</a> for more about functions). Thus, in both of these cases, an error occurs since the left-hand value is not a function.</p><p>The above syntactic enhancements significantly reduce the visual noise incurred when writing common mathematical formulae. Note that no whitespace may come between a numeric literal coefficient and the identifier or parenthesized expression which it multiplies.</p><h3><a class="nav-anchor" id="Syntax-Conflicts-1" href="#Syntax-Conflicts-1">Syntax Conflicts</a></h3><p>Juxtaposed literal coefficient syntax may conflict with two numeric literal syntaxes: hexadecimal integer literals and engineering notation for floating-point literals. 
Here are some situations where syntactic conflicts arise:</p><ul><li><p>The hexadecimal integer literal expression <code>0xff</code> could be interpreted as the numeric literal <code>0</code> multiplied by the variable <code>xff</code>.</p></li><li><p>The floating-point literal expression <code>1e10</code> could be interpreted as the numeric literal <code>1</code> multiplied by the variable <code>e10</code>, and similarly with the equivalent <code>E</code> form.</p></li></ul><p>In both cases, we resolve the ambiguity in favor of interpretation as numeric literals:</p><ul><li><p>Expressions starting with <code>0x</code> are always hexadecimal literals.</p></li><li><p>Expressions starting with a numeric literal followed by <code>e</code> or <code>E</code> are always floating-point literals.</p></li></ul><h2><a class="nav-anchor" id="Literal-zero-and-one-1" href="#Literal-zero-and-one-1">Literal zero and one</a></h2><p>Julia provides functions which return literal 0 and 1 corresponding to a specified type or the type of a given variable.</p><table><tr><th>Function</th><th>Description</th></tr><tr><td><a href="../stdlib/numbers.html#Base.zero"><code>zero(x)</code></a></td><td>Literal zero of type <code>x</code> or type of variable <code>x</code></td></tr><tr><td><a href="../stdlib/numbers.html#Base.one"><code>one(x)</code></a></td><td>Literal one of type <code>x</code> or type of variable <code>x</code></td></tr></table><p>These functions are useful in <a href="mathematical-operations.html#Numeric-Comparisons-1">Numeric Comparisons</a> to avoid overhead from unnecessary <a href="conversion-and-promotion.html#conversion-and-promotion-1">type conversion</a>.</p><p>Examples:</p><pre><code class="language-julia-repl">julia> zero(Float32)
0.0f0

julia> zero(1.0)
0.0

julia> one(Int32)
1
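
# Illustrative sketch: the printed 1 does not show its type, so typeof is used
# here to confirm that the result keeps the requested type.
julia> typeof(one(Int32))
Int32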

julia> one(BigFloat)
1.000000000000000000000000000000000000000000000000000000000000000000000000000000</code></pre><footer><hr/><a class="previous" href="variables.html"><span class="direction">Previous</span><span class="title">Variables</span></a><a class="next" href="mathematical-operations.html"><span class="direction">Next</span><span class="title">Mathematical Operations and Elementary Functions</span></a></footer></article></body></html>