WordPress Security

Last year, WordPress was responsible for 83% of infected content management sites. Make sure you’re not contributing to those infections and learn how to securely manage WordPress.

(This article is kindly sponsored by Sucuri.) WordPress security doesn’t have a good reputation. More than 70% of all WordPress sites carry some kind of vulnerability, according to research done on more than 40,000 WordPress sites by Alexa. If you develop WordPress themes or plugins, or use WordPress for your websites, that number should scare you.

There’s a lot you can do to make sure you’re not part of the 70%, but it takes more work than just installing a plugin or escaping a string. A lot of advice in this article comes from Sucuri’s guide on WordPress security and years of personal experience.

Is WordPress Insecure?

WordPress has the largest market share among content management systems and a 30% market share among the most popular 10 million sites on the web. That kind of success makes it a big target for hacks. WordPress isn’t less secure than other content management systems — it’s just more successful.

Vulnerabilities in WordPress core are responsible for less than 10% of all WordPress hacks. Most of those are from out-of-date WordPress installs. Hacks that exploit actual security holes in up-to-date versions of WordPress core (also known as zero-day exploits) account for a tiny percentage of all hacks.

The rest of the infections were caused by plugins, themes, hosting, and users. And you, as a WordPress website developer, have control over all of those. If this seems like a big hassle to you, then I can recommend Sucuri’s agency plan. Otherwise, let’s find out how to deal with WordPress security ourselves!

Who’s Attacking You And Why?

Let’s bust a myth first: A small WordPress website is still an attractive target for hackers. Attacks on a personal basis are very rare. Most hacked WordPress websites are compromised automatically by either a bot or a botnet.

Bots are computer programs that constantly search for websites to hack. They don’t care who you are; they just look for a weakness in your defences. A botnet combines the computing power of many bots to tackle bigger tasks.

Hackers are primarily looking for a way into your server so that they can use your server’s computing power and turn it loose on some other goal or target. Hackers want your server for the following reasons.

Sending Spam

Spam accounts for about 60% of all email, and it has to be sent from somewhere. Many hackers want to gain entry to your server through a faulty plugin or an ancient version of WordPress core so that they can turn your server into a spamming machine.

Attacking Other Websites

Distributed denial-of-service attacks use many computers to flood a website with so much traffic that it can’t keep up. These attacks are very difficult to mitigate, especially when they are done right. Hackers who break into your server can add it to a pool of servers used to attack other websites.

Stealing Resources

Mining cryptocurrency is very popular now, but it takes a lot of computing power. Hackers who don’t want to spend a lot of money on a server farm will break into unprotected WordPress websites and gain access to servers or to your websites’ visitors and steal computing power.

Bumping SEO Scores

A particularly popular hack for WordPress is to gain access to its database and add a bunch of (hidden) text underneath each post, linking to another website. It’s a really quick way to bump one’s SEO score, although Google is getting more vigilant about this behavior, and blacklistings are increasing.

Stealing Data

Data is valuable, especially when it’s linked to user profiles and e-commerce information. Getting this data and selling it can make an attacker a handsome profit.

Why Does Security Matter?

Apart from not giving criminals the satisfaction, there are plenty of reasons why your website should be secure by default. Having cleaned and dealt with plenty of WordPress hacks myself, I can surely say that they never occur at a convenient time. Cleaning up can take hours and will cost either you or your client money.

To get a hacked WordPress website up and running again, you’ll need to remove and replace every bit of third-party code (including WordPress core); comb through your own code line by line and all other folders on the server to make sure they are still clean; check whether unauthorized users have gained access; and replace all passwords in WordPress, on your server and on your database.

Plenty of services can clean up a WordPress website for you, but prevention is so much better in the long run.

Apart from the cost of cleaning up, hacks can also cost you a lot in missed sales or leads. Hacks move you lower in search rankings, resulting in fewer visitors and fewer conversions.

More than the financial cost, getting hacked hurts your reputation. Visitors come to your website because they trust you. Getting hacked damages your reputation, and that takes a long time to repair.

There’s also a real possibility of legal issues, especially if you have customers in the EU, where GDPR legislation will go into effect in the summer of 2018. That new legislation includes a hefty fine for data breaches that aren’t handled properly.

Money, reputation and legal problems: Bad security can cost you a lot. Investing some time in getting your website, code and team set up with a mindset of security will definitely pay off.

Let’s find out how we can prevent all of this nastiness.

The CIA Triad

The CIA triad is a basic framework for every digital security project. It stands for confidentiality, integrity and availability. CIA is a set of rules that limits information access to the right parties, makes sure the information is trustworthy and accurate, and guarantees reliable access to that information.

For WordPress, the CIA framework boils down to the following.

Confidentiality

Make sure logged-in users have the right roles assigned and that their capabilities are kept in check. Only give users the minimum access they need, and make sure that administrator information doesn’t leak out to the wrong party. You can do so by hardening WordPress’ admin area and being careful with usernames and credentials.

Integrity

Show accurate information on your website, and make sure that user interactions on your website happen correctly.

When accepting requests on both the front and back end, always check that the intent matches the actual action. When data is posted, always filter the data in your code for malicious content by using sanitization and escapes. Make sure spam gets removed by using a spam protection service such as Akismet.

Availability

Make sure your WordPress install, plugins and themes are up to date and hosted on a reliable (preferably managed) WordPress host. Daily automated backups also help to ensure that your website stays available to the public.

All three elements lean on each other for support. Code integrity will not work on its own if a user’s confidential password is easily stolen or guessed. All aspects are important to a solid and secure platform.

Security is a lot of hard work. Apart from the work that can be done in code, there’s a huge human element to this framework. Security is a constant process; it can’t be solved by a single plugin.

Part 1: Integrity — Trust Nothing

Verify the intent of user actions and the integrity of the data you’re handling. Throw your inner hippie out the door. Nothing can be trusted online, so double-check everything you do for possible malicious intent.

Data Validation and Sanitization

WordPress is excellent at handling data. It makes sure that every interaction is validated and that every bit of data is sanitized, but that’s only in WordPress core. If you’re building your own plugin or theme or even just checking a piece of third-party code, knowing how to do this is essential.

//Cast our variable to a string, and sanitize it.
update_post_meta( $post->ID, 'some-meta', sanitize_text_field( (string) $_POST['some-meta'] ) );

//Make sure our variable is an absolute integer.
update_post_meta( $post->ID, 'some-int', absint( $_POST['int'] ) );

In this example, we’ve added two pieces of data to a WordPress post using update_post_meta. The first is a string; so, we cast it as a string in PHP and strip unwanted characters and tags with sanitize_text_field, one of WordPress’ many sanitization functions.

We’ve also added an integer to that post and used absint to make sure this is an absolute (and non-negative) integer.

Using core WordPress functions such as update_post_meta is a better idea than using the WordPress database directly. This is because WordPress checks everything that needs to be stored in the database for so-called SQL injections. A SQL injection attack runs malicious SQL code through the forms on your website. This code manipulates the database to, for instance, destroy everything, leak user data or create false administrator accounts.

If you ever need to work with a custom table or perform a complicated query in WordPress, use the native WPDB class, and use the prepare function on all your queries to prevent SQL injection attacks:

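global $wpdb;
// Table names can't be placeholders: %s would wrap the name in quotes. Only values are prepared.
$tableName = $wpdb->prefix . 'my_table';
// The "id" column and $id are hypothetical, for illustration only.
$sql = $wpdb->prepare( "SELECT * FROM {$tableName} WHERE id = %d", absint( $id ) );
$results = $wpdb->get_results( $sql );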

$wpdb->prepare goes through every variable to make sure there’s no chance of a SQL injection attack.

Escaping

Escaping output is just as important as sanitizing input. Validating data before you save it is important, but you can’t be 100% sure it’s still safe. Trust nothing. WordPress uses a lot of filters to enable plugins and themes to change data on the fly, so there’s a good chance that your data will get parsed through other plugins as well. Escaping data before adding it to your theme or plugin is a smart thing to do.

Escaping is mainly meant to prevent cross-site scripting (XSS) attacks. XSS attacks inject malicious code into the front end of your website. An added bonus of escaping data is that you can be sure that your markup is still valid afterwards.

WordPress has many escaping functions. Here’s a simple example:

<a href="<?php echo esc_url( $url ); ?>" title="<?php echo esc_attr( $title ); ?>"><?php echo esc_html( $title ); ?></a>

Escape as late as possible. This ensures that you have the final say over your data.

Securing Requests

WordPress admin requests are already pretty secure if you have SSL enabled and if you have a decent host, but some vulnerabilities still exist. You need to check a user’s intent and validate that the incoming request is something that was done by the actual logged-in user.

WordPress validates intent with nonces. The name “nonce” (short for “number used only once”) isn’t really an accurate description of this API in WordPress. It doesn’t only use numbers, and it is much more like the cross-site request forgery (CSRF) token that you’ll find in every modern web framework. These tokens make sure attackers can’t forge requests on behalf of a logged-in user. It’s a lot more than just a nonce, but WordPress likes backwards-compatibility, so the name stuck.

Nonces are sent along with every vulnerable request that a user makes. They’re attached to URLs and forms, and they always need to be checked on the receiving end before performing the request. You can add a nonce to a form or a URL. Here’s an example used in a form:

<form method="post">
<!-- Add a nonce field: -->
<?php wp_nonce_field( 'post_custom_form' ); ?>

<!-- other fields: -->
...
</form>

In this case, we’re just using the simple helper function wp_nonce_field(), which generates two hidden fields for us that will look like this:

<input type="hidden" id="_wpnonce" name="_wpnonce" value="e558d2674e" />

<input type="hidden" name="_wp_http_referer" value="/wp-admin/post.php?post=2&action=edit" />

The first field checks intent by using a generated code with the 'post_custom_form' string that we’ve passed to the function. The second field adds a referrer to validate whether the request was made from within the WordPress installation.

Before processing your task on the other end of the form or URL, you would check the nonce and its validity with wp_verify_nonce:

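// Reject the request if the nonce is missing or doesn't match our action name.
if ( ! isset( $_REQUEST['_wpnonce'] ) || ! wp_verify_nonce( $_REQUEST['_wpnonce'], 'post_custom_form' ) ) {
    wp_die( "Nonce isn't valid" );
}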

Here, we’re checking the nonce with our action name, and if it doesn’t match, we stop processing the form.

Third-Party Code

Third-party plugins and themes are a hotbed for hacks. They’re also the toughest nut to crack when ensuring the security of your website.

Most WordPress hacks are caused by plugins, themes, and out-of-date copies of WordPress. No piece of software is 100% secure, but a lot of plugins and themes out there either haven’t been updated in a while by their developers or weren’t secure to begin with.

Less code means less to hack. So, before installing yet another plugin, ask yourself whether you really need it. Is there another way to solve this problem?

If you’re sure you need a plugin or theme, then judge it carefully. Look at the rating, the “last updated” date and the required PHP version when browsing through WordPress’ plugin directory. If you’ve found what you’re looking for and everything seems to work, search for any mentions of it on a trusted security blog, such as Sucuri or WordFence.

Another option is to scan the code and make sure it contains proper nonces, sanitization and escaping; these are usually signs of well-written and secure code. You don’t have to know PHP or do a complete code review. A simple and quick way to verify proper use of WordPress security functions is to search the plugin’s code for these strings:

  • esc_attr
  • esc_html
  • wp_nonce_field
  • wp_nonce_url
  • sanitize_text_field
  • $wpdb->prepare

A plugin could still be secure if it doesn’t include all of these strings, but if none or a low number of these strings are found, that is a red flag. If you do find a vulnerability, please share it with the creator in private, and allow them time to fix it.
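The string search described above can be automated. The following is a minimal Python sketch, not a security audit: the helper names `count_markers` and `scan_plugin` are made up for illustration, and a plain substring count will also match commented-out code.

```python
from collections import Counter
from pathlib import Path

# The strings listed in the article.
MARKERS = [
    "esc_attr",
    "esc_html",
    "wp_nonce_field",
    "wp_nonce_url",
    "sanitize_text_field",
    "$wpdb->prepare",
]


def count_markers(source: str) -> Counter:
    """Count how often each marker string occurs in one chunk of PHP source."""
    return Counter({marker: source.count(marker) for marker in MARKERS})


def scan_plugin(plugin_dir: str) -> Counter:
    """Sum the marker counts over every .php file below plugin_dir."""
    totals = Counter({marker: 0 for marker in MARKERS})
    for php_file in Path(plugin_dir).rglob("*.php"):
        totals.update(count_markers(php_file.read_text(errors="ignore")))
    return totals


# Hypothetical one-line template, used here instead of a real plugin folder.
sample = "<?php wp_nonce_field( 'save' ); echo esc_html( $title ); ?>"
for marker, hits in count_markers(sample).items():
    print(f"{marker}: {hits}")
```

To check a real plugin, you would call `scan_plugin()` with the path to its folder. Treat a total of zero or very low counts as a prompt to look closer, not as proof either way.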

Keeping track of vulnerabilities in the WordPress plugin space is getting easier with initiatives such as wpvulndb.

Note: Some themes out there bundle versions of plugins with their code. This is a symptom of WordPress not having great out-of-the-box dependency management, but it’s also a sign of a very poorly written theme. Always avoid these themes because they include code bases that can’t be updated.

Themes and plugins rarely contain code written by only one developer. Composer and NPM have made it so much easier to depend on other libraries that it’s become a popular attack vector. If you’re downloading a cut-and-dry WordPress theme or plugin, this really isn’t a concern, but if you’re working with tools that use Composer or NPM, then it doesn’t hurt to check their dependencies. You can check Composer dependencies with a free command-line interface (CLI) tool by SensioLabs. A service such as Snyk (which you can use for free but which also has premium options) enables you to check every dependency in your project.

Part 2: Availability: Keep It Simple

Your main goal is to keep your website online without interruptions. Even with top-notch security, you can still get in trouble. When that happens, a great backup will save you a big headache.

Updates

Open-source can’t exist without updates. Most attacks on WordPress websites happen on outdated versions of either the core software or plugins. Security updates to WordPress’ core are now dealt with automatically (unless you’ve disabled this, you monster!), but security updates in plugins are a different story.

Updating is normally safe with popular, trusted plugins, but all plugins should be tested before they go live on your website. Tools such as WP CLI make updating everything much easier. WordPress lead developer Mark Jaquith wrote an excellent blog post on updating all plugins automatically yet gradually, so that you can filter out possible errors.

Users, Roles and Capabilities

“Availability” in the CIA triad has to do with getting information in the right hands. Our main priority with this is limiting the capabilities of your back-end users. Don’t give everyone an admin account.

The admin account in WordPress is unusually powerful. There’s even an option in vanilla WordPress to alter your complete code base from within the WordPress admin account. (If this is new to you and you haven’t disabled this, please do.)

The roles and capabilities system in WordPress is powerful and is very easy to alter in code. I create a lot of new roles when working with WordPress. The main benefit of this is that you get full control over which parts of the system various users get to access, but another huge benefit is that it prevents third-party code from altering the standard capabilities of WordPress core.

Email

WordPress usually handles email via the server it’s on, but this makes all of your email completely dependent on the server it’s running on. Prevent your emails from getting intercepted and seen as spam by using an SMTP service. A lot of plugin options are available to make sure that all of your mail is sent over a secure SMTP connection.

You will, however, need access to the domain name’s DNS settings to add a Sender Policy Framework (SPF) record. All good SMTP services will provide the exact record that needs to be added. An SPF record ensures that your SMTP service is authorized by the domain to send email in its name.

Monitoring

Monitoring your website online is a 24/7 task that can be fully automated. In the case of WordPress, we’re interested in uptime and file integrity.

Monitoring uptime is usually something a good host will do for you. Tools such as Uptime Robot add an extra layer of monitoring, and your first 50 websites are completely free.

Regarding file integrity, if a hacker gains access to your server they can change your code.

In this case, plugins are the answer to your problem. Sucuri has a great auditing plugin. It checks all files in your installation against a vast database of known malicious code. It also checks whether WordPress core is still 100% WordPress core, and it gives you a heads up if there’s been a breach, so that you can fix it as soon as possible.

Backups

The ultimate fail-safe of every security process is automated backups. Most good hosts will do this for you, but there are other good options if your host doesn’t offer backups. Automattic makes one named VaultPress, and tools such as BackupBuddy back up to a Dropbox account or an Amazon S3 bucket.

Most of the reliable services in the WordPress backup space are either premium services or premium plugins. Depending on whether you need to fully control your data, you might prefer a plugin that comes with a cloud host, instead of a service. Either one is worth every penny, though.

Hosting

WordPress isn’t the only piece of software running on your server. Plenty of attack vectors are open when you’re on crappy hosting. In fact, bad hosting is the main reason why WordPress still supports outdated versions of PHP. At time of writing, WordPress’ own statistics page reports that 32.5% of all WordPress installations are running on PHP versions that do not receive security updates anymore.

PHP versions in WordPress as of 10 May 2018.

Note the almost 60% of installations running on PHP 5.6 and 7.0, which will receive security patches only until the end of this year.

Hosting is important not only for keeping your server’s software up to date, though. A good host will offer many more services, such as automated daily backups, automated updates, file-integrity monitoring and email security. There’s a big difference between managed WordPress hosts and hosts that give you an online folder with database access.

The best advice is to find a decent managed WordPress host. They cost a little more, but they provide a great backbone for your WordPress website.

Part 3: Confidentiality

Even if you’ve made sure that your code base is as secure as it can be and you’re on a great WordPress host, surrounded by malware scanners and backups, you’re still going to experience security problems, because people are the worst… at Internet security.

Confidentiality is about educating yourself, your client and the users of the website.

Confidential Data

You might not know it, but your plugins and themes are probably showing valuable confidential data. If, for instance, you have WP_DEBUG set to true, then you’re showing every hacker your website’s root path on the server. Debugging data should have no place in your production website.

Other valuable data sources are comments and author pages. These are filled with usernames and even email addresses. A hacker could use these in combination with a weak password to get into your website. Be wary of what you show the outside world.

Also, double-check that you’ve put wp-config.php in your .gitignore.

Don’t Code Alone

A way to prevent a lot of mistakes from sneaking into your code base is to practice pair programming. If you’re by yourself, this is a lot harder, but many online communities are willing to do quick code audits. WordPress, for instance, uses Slack to communicate everything about the development of its platform. You will find a lot of people on there who are willing to help. Slower but better alternatives are the WordPress forums, StackOverflow and GitHub Issues, where your questions (and their answers!) are saved so that other people can benefit from them.

Asking for input can be tough, but people love showing their expertise, and WordPress in general has a very open and welcoming community. The point is that if you never ask for input on the quality of your code, then you will have no idea whether your code is secure.

Logins and Passwords

Your clients will need to log into WordPress to manage their content. WordPress core does what it can to prevent weak passwords from getting through, but this usually isn’t enough.

I’d recommend adding a plugin for two-factor authentication to your website, along with a limit on login attempts. Even better, do away with passwords entirely and work with magic links.

Trust But Verify

So far in this article, we haven’t talked about social engineering at all. It’s a form of hacking that’s gaining momentum, but it generally isn’t used to hack into WordPress websites. It is, however, an excellent way to set up the culture around your website with security in mind. That’s because the best defense against social engineering is “Trust but verify”.

Whenever a client, a user or your boss asks for something related to security, the best way to deal with it is to trust them, but first to verify whether what they are saying is true.

A client can claim they need administrator access to WordPress, but your job is to verify whether this is true. Do they actually need access, or are they missing just a single capability in their role? Is there a way to solve this problem without adding possibly new attack vectors?

“Trust but verify” is a simple yet effective mantra when it comes to security questions, and it can really help get people up to speed.

Conclusion

Is WordPress insecure? No, it’s not. WordPress core is constantly being updated and fixed, and most reported WordPress hacks aren’t from WordPress itself. Is the culture surrounding WordPress insecure? You betcha!

But by having security in mind with every line of code you write, every user you add, every plugin you enable and every hosting bill you pay, you can at least ensure that you’re running a secure website that keeps your reputation intact and your data safe.

Akin’s Laws of Spacecraft Design

1. Engineering is done with numbers. Analysis without numbers is only an opinion.

2. To design a spacecraft right takes an infinite amount of effort. This is why it’s a good idea to design them to operate when some things are wrong.

3. Design is an iterative process. The necessary number of iterations is one more than the number you have currently done. This is true at any point in time.

4. Your best design efforts will inevitably wind up being useless in the final design. Learn to live with the disappointment.

5. (Miller’s Law) Three points determine a curve.

6. (Mar’s Law) Everything is linear if plotted log-log with a fat magic marker.

7. At the start of any design effort, the person who most wants to be team leader is least likely to be capable of it.

8. In nature, the optimum is almost always in the middle somewhere. Distrust assertions that the optimum is at an extreme point.

9. Not having all the information you need is never a satisfactory excuse for not starting the analysis.

10. When in doubt, estimate. In an emergency, guess. But be sure to go back and clean up the mess when the real numbers come along.

11. Sometimes, the fastest way to get to the end is to throw everything out and start over.

12. There is never a single right solution. There are always multiple wrong ones, though.

13. Design is based on requirements. There’s no justification for designing something one bit “better” than the requirements dictate.

14. (Edison’s Law) “Better” is the enemy of “good”.

15. (Shea’s Law) The ability to improve a design occurs primarily at the interfaces. This is also the prime location for screwing it up.

16. The previous people who did a similar analysis did not have a direct pipeline to the wisdom of the ages. There is therefore no reason to believe their analysis over yours. There is especially no reason to present their analysis as yours.

17. The fact that an analysis appears in print has no relationship to the likelihood of its being correct.

18. Past experience is excellent for providing a reality check. Too much reality can doom an otherwise worthwhile design, though.

19. The odds are greatly against you being immensely smarter than everyone else in the field. If your analysis says your terminal velocity is twice the speed of light, you may have invented warp drive, but the chances are a lot better that you’ve screwed up.

20. A bad design with a good presentation is doomed eventually. A good design with a bad presentation is doomed immediately.

21. (Larrabee’s Law) Half of everything you hear in a classroom is crap. Education is figuring out which half is which.

22. When in doubt, document. (Documentation requirements will reach a maximum shortly after the termination of a program.)

23. The schedule you develop will seem like a complete work of fiction up until the time your customer fires you for not meeting it.

24. It’s called a “Work Breakdown Structure” because the Work remaining will grow until you have a Breakdown, unless you enforce some Structure on it.

25. (Bowden’s Law) Following a testing failure, it’s always possible to refine the analysis to show that you really had negative margins all along.

26. (Montemerlo’s Law) Don’t do nuthin’ dumb.

27. (Varsi’s Law) Schedules only move in one direction.

28. (Ranger’s Law) There ain’t no such thing as a free launch.

29. (von Tiesenhausen’s Law of Program Management) To get an accurate estimate of final program requirements, multiply the initial time estimates by pi, and slide the decimal point on the cost estimates one place to the right.

30. (von Tiesenhausen’s Law of Engineering Design) If you want to have a maximum effect on the design of a new engineering system, learn to draw. Engineers always wind up designing the vehicle to look like the initial artist’s concept.

31. (Mo’s Law of Evolutionary Development) You can’t get to the moon by climbing successively taller trees.

32. (Atkin’s Law of Demonstrations) When the hardware is working perfectly, the really important visitors don’t show up.

33. (Patton’s Law of Program Planning) A good plan violently executed now is better than a perfect plan next week.

34. (Roosevelt’s Law of Task Planning) Do what you can, where you are, with what you have.

35. (de Saint-Exupery’s Law of Design) A designer knows that he has achieved perfection not when there is nothing left to add, but when there is nothing left to take away.

36. Any run-of-the-mill engineer can design something which is elegant. A good engineer designs systems to be efficient. A great engineer designs them to be effective.

37. (Henshaw’s Law) One key to success in a mission is establishing clear lines of blame.

38. Capabilities drive requirements, regardless of what the systems engineering textbooks say.

39. Any exploration program which “just happens” to include a new launch vehicle is, de facto, a launch vehicle program.

39. (alternate formulation) The three keys to keeping a new human space program affordable and on schedule:
1)  No new launch vehicles.
2)  No new launch vehicles.
3)  Whatever you do, don’t develop any new launch vehicles.

40. (McBryan’s Law) You can’t make it better until you make it work.

41. There’s never enough time to do it right, but somehow, there’s always enough time to do it over.

42. Space is a completely unforgiving environment. If you screw up the engineering, somebody dies (and there’s no partial credit because most of the analysis was right…)

*I’ve been involved in spacecraft and space systems design and development for my entire career, including teaching the senior-level capstone spacecraft design course, for ten years at MIT and now at the University of Maryland for more than two decades. These are some bits of wisdom that I have gleaned during that time, some by picking up on the experience of others, but mostly by screwing up myself. I originally wrote these up and handed them out to my senior design class, as a strong hint on how best to survive my design experience. Months later, I got a phone call from a friend in California complimenting me on the Laws, which he saw on a “joke-of-the-day” listserve. Since then, I’m aware of half a dozen sites around the world that present various editions of the Laws, and even one site which has converted them (without attribution, of course) to the Laws of Certified Public Accounting. (Don’t ask…) Anyone is welcome to link to these, use them, post them, send me suggestions of additional laws, but I do maintain that this is the canonical set of Akin’s Laws…

Simplicity vs Complexity

Complexity eats a lot of time and resources.
The amount of time and resources increases exponentially with the complexity of the system.
Complexity can be measured by lines of code, levels of hierarchy and the number of tools used.
Linear is much better than nested.
A simpler system takes much less brain power to understand, which also increases developer efficiency and reduces the number of issues, because developers are less tired and there are fewer places for things to go wrong.
A simpler system is much easier to debug, maintain and extend.

Flexbox responsive columns

html:

<div class="row">
  <div class="col-6 col-sm-12">
    Left column. 50% on large screens. 100% on smaller screens.
  </div>
  <div class="col-6 col-sm-12 col-right">
    Right column. 50% on large screens. 100% on smaller screens. Has different background for visual difference.
  </div>
</div>

css:

.row {
  display: flex;
  flex-wrap: wrap;
}

.col-6 {
  flex: 0 0 50%;
  background-color: #eee;
}

.col-right {
  background-color: #dde;
}

@media (max-width: 576px) {
  .col-sm-12 {
    flex-basis: 100%;
  }
}

Floating Point Arithmetic

Floating-point numbers are represented in computer hardware as base 2 (binary)
fractions. For example, the decimal fraction

0.125

has value 1/10 + 2/100 + 5/1000, and in the same way the binary fraction

0.001

has value 0/2 + 0/4 + 1/8. These two fractions have identical values, the only
real difference being that the first is written in base 10 fractional notation,
and the second in base 2.

Unfortunately, most decimal fractions cannot be represented exactly as binary
fractions. A consequence is that, in general, the decimal floating-point
numbers you enter are only approximated by the binary floating-point numbers
actually stored in the machine.

The problem is easier to understand at first in base 10. Consider the fraction
1/3. You can approximate that as a base 10 fraction:

0.3

or, better,

0.33

or, better,

0.333

and so on. No matter how many digits you’re willing to write down, the result
will never be exactly 1/3, but will be an increasingly better approximation of
1/3.

In the same way, no matter how many base 2 digits you’re willing to use, the
decimal value 0.1 cannot be represented exactly as a base 2 fraction. In base
2, 1/10 is the infinitely repeating fraction

0.0001100110011001100110011001100110011001100110011...

Stop at any finite number of bits, and you get an approximation. On most
machines today, floats are approximated using a binary fraction with
the numerator using the first 53 bits starting with the most significant bit and
with the denominator as a power of two. In the case of 1/10, the binary fraction
is 3602879701896397 / 2 ** 55 which is close to but not exactly
equal to the true value of 1/10.

Many users are not aware of the approximation because of the way values are
displayed. Python only prints a decimal approximation to the true decimal
value of the binary approximation stored by the machine. On most machines, if
Python were to print the true decimal value of the binary approximation stored
for 0.1, it would have to display

>>> 0.1
0.1000000000000000055511151231257827021181583404541015625

That is more digits than most people find useful, so Python keeps the number
of digits manageable by displaying a rounded value instead:

>>> 1 / 10
0.1

Just remember, even though the printed result looks like the exact value
of 1/10, the actual stored value is the nearest representable binary fraction.

Interestingly, there are many different decimal numbers that share the same
nearest approximate binary fraction. For example, the numbers 0.1 and
0.10000000000000001 and
0.1000000000000000055511151231257827021181583404541015625 are all
approximated by 3602879701896397 / 2 ** 55. Since all of these decimal
values share the same approximation, any one of them could be displayed
while still preserving the invariant eval(repr(x)) == x.

Historically, the Python prompt and built-in repr() function would choose
the one with 17 significant digits, 0.10000000000000001. Starting with
Python 3.1, Python (on most systems) is now able to choose the shortest of
these and simply display 0.1.
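
A short sketch of the point above, assuming IEEE 754 double-precision floats and Python 3.1 or later. It uses float() in place of eval() for the round-trip check; the variable names are only illustrative.

```python
a = 0.1
b = 0.10000000000000001
c = 0.1000000000000000055511151231257827021181583404541015625

# All three literals are rounded to the same stored binary fraction.
same = (a == b == c)

# repr() picks the shortest string that still converts back to the same float.
shortest = repr(c)
round_trips = (float(shortest) == c)

# The older 17-significant-digit form can still be requested explicitly.
seventeen = format(a, ".17g")

print(same, shortest, round_trips, seventeen)
```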

Note that this is in the very nature of binary floating-point: this is not a bug
in Python, and it is not a bug in your code either. You’ll see the same kind of
thing in all languages that support your hardware’s floating-point arithmetic
(although some languages may not display the difference by default, or in all
output modes).

For more pleasant output, you may wish to use string formatting to produce a limited number of significant digits:

>>> import math
>>> format(math.pi, '.12g')  # give 12 significant digits
'3.14159265359'

>>> format(math.pi, '.2f')   # give 2 digits after the point
'3.14'

>>> repr(math.pi)
'3.141592653589793'

It’s important to realize that this is, in a real sense, an illusion: you’re
simply rounding the display of the true machine value.

One illusion may beget another. For example, since 0.1 is not exactly 1/10,
summing three values of 0.1 may not yield exactly 0.3, either:

>>> .1 + .1 + .1 == .3
False

Also, since the 0.1 cannot get any closer to the exact value of 1/10 and
0.3 cannot get any closer to the exact value of 3/10, then pre-rounding with
the round() function cannot help:

>>> round(.1, 1) + round(.1, 1) + round(.1, 1) == round(.3, 1)
False

Though the numbers cannot be made closer to their intended exact values,
the round() function can be useful for post-rounding so that results
with inexact values become comparable to one another:

>>> round(.1 + .1 + .1, 10) == round(.3, 10)
True
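
Post-rounding is one way to compare inexact results. Another option, not mentioned in the text above, is math.isclose() (available since Python 3.5), which compares within a tolerance. A minimal sketch, with illustrative variable names:

```python
import math

total = 0.1 + 0.1 + 0.1

# Exact comparison fails because both sides carry representation error.
exact_match = (total == 0.3)

# isclose() compares within a relative tolerance (rel_tol defaults to 1e-09).
close_match = math.isclose(total, 0.3)

# The tolerance is relative, so comparisons against 0.0 need an explicit abs_tol.
diff = total - 0.3
near_zero_default = math.isclose(diff, 0.0)
near_zero_abs = math.isclose(diff, 0.0, abs_tol=1e-12)

print(exact_match, close_match, near_zero_default, near_zero_abs)
```

The abs_tol value of 1e-12 is an arbitrary choice for this example; a suitable tolerance depends on the magnitude of the data.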

Binary floating-point arithmetic holds many surprises like this. The problem
with “0.1” is explained in precise detail below, in the “Representation Error”
section. See The Perils of Floating Point
for a more complete account of other common surprises.

As that says near the end, “there are no easy answers.” Still, don’t be unduly
wary of floating-point! The errors in Python float operations are inherited
from the floating-point hardware, and on most machines are on the order of no
more than 1 part in 2**53 per operation. That’s more than adequate for most
tasks, but you do need to keep in mind that it’s not decimal arithmetic and
that every float operation can suffer a new rounding error.
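
The "1 part in 2**53" figure can be related to what the interpreter reports about its float type. This sketch assumes IEEE 754 double precision, which is what sys.float_info describes on most machines; the variable names are only illustrative.

```python
import sys

# Number of bits in the significand of a float (53 for IEEE 754 doubles).
bits = sys.float_info.mant_dig

# Gap between 1.0 and the next larger representable float (2**-52 for doubles).
eps = sys.float_info.epsilon

# Half of that gap, 2**-53, is lost to rounding when added to 1.0 ...
lost = (1.0 + eps / 2 == 1.0)

# ... while the full gap survives.
kept = (1.0 + eps != 1.0)

print(bits, eps, lost, kept)
```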

While pathological cases do exist, for most casual use of floating-point
arithmetic you’ll see the result you expect in the end if you simply round the
display of your final results to the number of decimal digits you expect.
str() usually suffices, and for finer control see the str.format()
method’s format specifiers in Format String Syntax.

For use cases which require exact decimal representation, try using the
decimal module which implements decimal arithmetic suitable for
accounting applications and high-precision applications.

Another form of exact arithmetic is supported by the fractions module
which implements arithmetic based on rational numbers (so the numbers like
1/3 can be represented exactly).
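
A minimal sketch of both modules, repeating the 0.1 + 0.1 + 0.1 comparison from earlier. The variable names are only illustrative.

```python
from decimal import Decimal
from fractions import Fraction

# Build Decimal values from strings so no binary rounding happens on the way in.
decimal_sum = Decimal("0.1") + Decimal("0.1") + Decimal("0.1")
decimal_ok = (decimal_sum == Decimal("0.3"))

# Passing a float instead exposes the binary approximation that was already stored.
from_float = Decimal(0.1)

# Fraction keeps an exact numerator and denominator, so 1/3 is represented exactly.
third = Fraction(1, 3)
fraction_ok = (third + third + third == 1)

print(decimal_ok, fraction_ok)
print(from_float)
```

Note the difference between Decimal("0.1") and Decimal(0.1): only the string form gives exactly one tenth.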

If you are a heavy user of floating point operations you should take a look
at the Numerical Python package and many other packages for mathematical and
statistical operations supplied by the SciPy project. See <https://scipy.org>.

Python provides tools that may help on those rare occasions when you really
do want to know the exact value of a float. The
float.as_integer_ratio() method expresses the value of a float as a
fraction:

>>> x = 3.14159
>>> x.as_integer_ratio()
(3537115888337719, 1125899906842624)

Since the ratio is exact, it can be used to losslessly recreate the
original value:

>>> x == 3537115888337719 / 1125899906842624
True

The float.hex() method expresses a float in hexadecimal (base
16), again giving the exact value stored by your computer:

>>> x.hex()
'0x1.921f9f01b866ep+1'

This precise hexadecimal representation can be used to reconstruct
the float value exactly:

>>> x == float.fromhex('0x1.921f9f01b866ep+1')
True

Since the representation is exact, it is useful for reliably porting values
across different versions of Python (platform independence) and exchanging
data with other languages that support the same format (such as Java and C99).

Another helpful tool is the math.fsum() function which helps mitigate
loss-of-precision during summation. It tracks “lost digits” as values are
added onto a running total. That can make a difference in overall accuracy
so that the errors do not accumulate to the point where they affect the
final total:

>>> 0.1 + 0.1 + 0.1 + 0.1 + 0.1 + 0.1 + 0.1 + 0.1 + 0.1 + 0.1 == 1.0
False
>>> math.fsum([0.1] * 10) == 1.0
True
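
To give a feel for what tracking "lost digits" means, here is a sketch of Neumaier's variant of Kahan compensated summation. This is not how math.fsum() is implemented (as far as I know it keeps a list of exact partial sums); it is a simpler technique swapped in for illustration. The function name neumaier_sum and the sample data are made up for this example. The naive total is computed with an explicit loop because the built-in sum() on newer Python versions (3.12 and later) already applies a compensated method to floats.

```python
import math


def neumaier_sum(values):
    """Sum floats while carrying a running correction for lost low-order digits."""
    total = 0.0
    correction = 0.0
    for x in values:
        t = total + x
        if abs(total) >= abs(x):
            # Low-order digits of x were lost in the addition; recover them.
            correction += (total - t) + x
        else:
            # Low-order digits of total were lost instead.
            correction += (x - t) + total
        total = t
    return total + correction


data = [1.0, 1e100, 1.0, -1e100]

# Plain left-to-right addition: both 1.0 terms are swallowed by 1e100.
naive = 0.0
for x in data:
    naive += x

compensated = neumaier_sum(data)
reference = math.fsum(data)

print(naive, compensated, reference)
```

For real work math.fsum() is the better choice; the sketch only shows the idea of keeping the rounding error of each addition and adding it back at the end.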

Web-Dev Notes

“DRY is often misinterpreted as the necessity to never repeat the exact same thing twice. This is impractical and usually counterproductive, and can lead to forced abstractions, over-thought and over-engineered code.” (Harry Roberts)

DRY, SRP, modularity, etc. are not ultimate goals or strict rules. They are just principles and recommendations.