Sunday, August 26, 2018

7 steps to better code reviews Code reviews make better software, better developers, and better teams. Follow these steps to get them right

Source: https://www.infoworld.com/article/3297940/application-development/7-steps-to-better-code-reviews.html

In a field like software development that demands attention to detail, peer review is essential. When the slightest mistake can cause serious errors throughout the project, another set of eyes (or several) will help ensure that everything reaches its full potential. While there are automated tests you can perform to vet your code, nothing beats the human touch.
Code review has been demonstrated to significantly speed up the development process. But what are the responsibilities of the code reviewer? When running a code review, how do you ensure constructive feedback? How do you solicit input that will expedite and improve the project? Here are a few tips for running a solid code review.
  1. Establish goals. Code reviews are about more than just finding errors and bugs. You may be thinking about adding new features and how to implement them. You may be trying to ensure that the code meets certain style standards established by your organization. Whatever the goals are, it’s important that you make them clear at the very beginning of the process, and that everyone on the team understands and works towards them. If each team member has a different goal or viewpoint, it will be difficult to reach a consensus and make progress.
  2. Do your first pass. Try to get to the initial pass as soon as possible after you receive the request. You don’t have to go into depth just yet. Just do a quick overview and have your team write down their first impressions and thoughts.
  3. Use a ticketing system. Most software development platforms facilitate comments and discussion on different aspects of the code. Every proposed change to the code is a new ticket. As soon as any team member sees a change that needs to be made, they create a ticket for it. The ticket should describe what the change is, where it would go, and why it’s necessary. Then the others on your team can review the ticket and add their own comments. Not only will this system help you keep track of all proposed changes, but the discussion will lead to further improvement and refinement of the overall code.
  4. Run tests. You can try to spot tiny errors by looking at line after line of code, but it’s often easier to run the piece of code in question and see how it works. In doing so, it’s easier to find bugs in the context of how they affect your application. It can also provide insight into what features are missing or could be improved.
  5. Test proposed changes. Put the code into your testing environment and see how it functions with the proposed changes. Do the changes work? Has the software improved, or have the changes caused more problems? Do these changes work for the project’s overall budget? What still needs to be done? Create more tickets for discussion, based on the tests.
  6. Do your in-depth pass. Now it’s time to sift through the lines of code with a fine-toothed comb and find the bugs, the style issues, the misplaced parentheses, etc. Some people prefer to do this before testing the proposed changes from the first pass. They’ll wait until the end and then test all the changes at once. But testing the changes from your first pass can help inform your second pass. Plus, testing as you go can save you time and money, as opposed to saving all of your testing for the end.
  7. Submit the evaluation. Minor changes such as coding errors and typos can be fixed as you go along. But major changes should always be discussed with the code’s author first. Ask yourself, is the change you’re proposing really a problem, or just something that you would have done differently? Because in the end, it’s their code, not yours. Once you’ve submitted your evaluation of the code, talk to the author and find out why they did things a certain way. Then tell them your approach and see what they think. Hopefully, you’ll be able to see things from each other’s point of view and use those insights to make the code the best that it can be.
A code review is one of the most important aspects of programming. It allows you to address problems more quickly and efficiently, and ultimately deliver higher-quality code and a better software product. How will you make the best use of code reviews in your next project?
Rob Whitcomb is a senior software engineer at Surge. He has been building enterprise applications in a multitude of technologies for a decade. Surge is a Catalyte company.

Thursday, July 19, 2018

What’s new in the Anaconda distribution for Python Anaconda 5.2 adds job scheduling, support for GPUs, and integration with version control systems including Git and GitHub


Source: https://www.infoworld.com/article/3235218/python/whats-new-in-the-anaconda-distribution-for-python.html

Anaconda, the Python language distribution and work environment for scientific computing, data science, statistical analysis, and machine learning, is now available in version 5.2, with additions to both its enterprise and open-source community editions.


Where to download Anaconda 5.2

The community edition of Anaconda Distribution is available for free download directly from Anaconda’s website. The for-pay enterprise edition, with professional support, requires contacting the Anaconda (formerly Continuum Analytics) sales team.

Current version: What’s new in Anaconda 5.2

This enterprise edition of Anaconda, released this week, adds new features around job scheduling, integration with Git, and GPU acceleration.
Earlier versions of Anaconda Enterprise were built to allow professionals to leverage multiple machine learning libraries in a business context—TensorFlow, MXNet, Scikit-learn, and more. In version 5.2, Anaconda offers ways to train models on a securely shared central cluster of GPUs, so that models can be trained faster and more cost-effectively.


Also new in Anaconda Enterprise is the ability to integrate with external code repositories and continuous integration tools, such as Git, Mercurial, GitHub, and Bitbucket. A new job scheduling system allows tasks to be run at regular intervals—for instance, to retrain a model on new data.
Changes in the community version include the following:
  • Security fixes for 20 or so packages, based on CVE analyses.
  • Fixes to the Windows installer to prevent using invalid install paths or causing collisions with existing software components.
  • Better use of working directories on Windows in multi-user installation scenarios.

Previous version: What’s new in Anaconda 5.1

Anaconda 5.1, and the point fixes that followed, have mostly been minor touch-ups to both the enterprise and community editions.
Some notable changes to the enterprise edition include a new post-install setup script and GUI that ease the post-configuration needed with a new Anaconda Enterprise install (for instance, when setting up TLS certificates). You also have the ability to generate “custom Anaconda installers, parcels for Cloudera CDH, and management packs for Hortonworks HDP.” Changes to the community edition include the ability to use Microsoft Visual Studio Code as an editor option at install time.


Previous version: What’s new in Anaconda 5.0

The Linux and MacOS versions of Anaconda 5 have been built with new compilers: GCC 7.2 for Linux and Clang 4.0.1 for MacOS. This extends the speed benefits of those compilers to users of earlier editions of those OSes, namely MacOS 10.9 Mavericks and CentOS 6.
Anaconda 5 also provides Python packages rebuilt with the new compiler, through its package-management tool conda. However, for the time being, those rebuilt packages are available through a different installation channel.
Anaconda’s long-term plan is to make that new installation channel the default, as more packages get added to the new channel and as users obtain the newly optimized packages and give them a shakedown.


Anaconda’s conda tool simplifies installing Python packages used in stats and data analysis, because many of those packages have complex binary dependencies. Conda-forge is a GitHub organization where users can share packages, build recipes, and distributions of projects built for conda.
Some 3,200 packages from Conda-forge are available in their own package list. Among some of the most recently updated:
  • cassandra-driver, a Python module for working with Apache Cassandra and its binary data-access protocol.
  • pyinstaller, for bundling a Python app as a self-contained executable.
  • plotly, an interactive graphing library.
  • openblas, a library for basic vector and matrix math.
Anaconda’s strategy moving forward is to use Conda-forge as its source for build recipes, both for consistency’s sake and to allow a broader range of third-party packages to be used in Anaconda.
Also new in Anaconda 5.0:
  • More than 100 packages available through conda have been updated or revised. One major project for accelerating computational speeds on conventional CPUs, the Intel Math Kernel Library, is now available in version 2018.0.0.
  • NumPy users can now work with a wider range of versions of that popular math and statistics package. Other packages in Anaconda’s suite may depend on different versions of NumPy, but users may want access to the latest and greatest version. (Anaconda’s term for this is “dependency pinning.”)
  • R language users now have access to R version 3.4.2. All of R’s packages, including RStudio, were rebuilt using Anaconda’s new compilers.

Sunday, April 22, 2018

7 books you must read to be a real software developer It’s easy to learn to be a coder. But knowing how to code isn’t enough to get and keep a real job in software development

From: https://www.infoworld.com/article/3269032/application-development/7-books-you-must-read-to-be-a-real-software-developer.html


Congratulations on finishing your four-year computer science degree in two years with no practical software development experience, or on completing your coding bootcamp!
But there are a few more things you should know. And there are a few more things you should read.


  1. Code Complete: A Practical Handbook of Software Construction, Second Edition. You learned how to code and all, but did you learn when to code and what to code? Moreover, there are a number of things that you should probably know (like why Booleans may not make great status variables). While there is some dust even on the second edition, there is gold here.
  2. The Mythical Man-Month. Most problems that will happen on your first professional software project are explained in this book. Read it before your first job, but don’t quote it to people (enough people do that, and it just comes off as smug). I suppose you could also just read the complete works of Dilbert, but MMM is shorter.
  3. The Pragmatic Programmer: From Journeyman to Master. This book ages pretty well. Actually, it takes off where Code Complete ends. It is also much shorter.
  4. Design Patterns: Elements of Reusable Object-Oriented Software. The so-called Gang of Four book helps you learn the metapatterns of programming. This will save you from inventing your own whatever framework because you’ll realize that you have invented nothing new. It also will help you think about things in the right way.
  5. Extreme Programming Explained. Whether they do XP on the job or some chaotic adaptation of scrum that smells awfully waterfally (like most companies), this book teaches you how software development should probably work if anyone were motivated to do it right. Don’t worry, very few companies actually do pair programming. Though I admit it is probably good for you, if you don’t drive the other person to murder.
  6. Refactoring: Improving the Design of Existing Code. Your dream of creating anything from scratch is likely to be daunted. Almost everything has legacy code. You’ll spend most of your career dealing with crap code created by people who write like they just finished code camp (no offense)—or stuff created by “the offshore team” (which consists of the people who just finished the two-year version of a four-year computer science program). You’ll rarely be given enough time to rewrite it. Instead, learn how to refactor it.
  7. UML Distilled: A Brief Guide to the Standard Object Modeling Language, Third Edition. A good 70 percent of UML was a useless farce to sell overpriced clunky tools (looking at you, Rational Rose). Don’t learn UML to go around annoying people with useless class diagrams. Do learn the basics so you can read a sequence diagram and learn to think this way.
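The aside in the Code Complete entry about Booleans making poor status variables can be illustrated with a short sketch. The names below (OrderStatus, describe) are hypothetical, invented for this illustration and not taken from the book. A Boolean can distinguish only two states, so a third state forces a second flag or a rewrite; an enumeration names each state explicitly.

```python
from enum import Enum


class OrderStatus(Enum):
    """Hypothetical status type; a single is_done flag could express only two of these."""
    PENDING = "pending"
    SHIPPED = "shipped"
    CANCELLED = "cancelled"


def describe(status: OrderStatus) -> str:
    """Return a human-readable message for each named state."""
    messages = {
        OrderStatus.PENDING: "waiting to ship",
        OrderStatus.SHIPPED: "on its way",
        OrderStatus.CANCELLED: "will not ship",
    }
    return messages[status]


if __name__ == "__main__":
    print(describe(OrderStatus.CANCELLED))
```

Adding a fourth state later means adding one enum member and one message, rather than auditing every place a flag was tested.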

Thursday, April 5, 2018

What is Kubernetes? Container orchestration explained

Source: https://www.infoworld.com/article/3268073/containers/what-is-kubernetes-container-orchestration-explained.html

How the Kubernetes open source project from Google makes containerized applications astonishingly easy to deploy, scale, and manage


The rise of containers has reshaped the way people think about developing, deploying, and maintaining software. Drawing on the native isolation capabilities of modern operating systems, containers support VM-like separation of concerns, but with far less overhead and far greater flexibility of deployment than hypervisor-based virtual machines.
Containers are so lightweight and flexible, they have given rise to new application architectures. The new approach is to package the different services that constitute an application into separate containers, and to deploy those containers across a cluster of physical or virtual machines. This gives rise to the need for container orchestration—a tool that automates the deployment, management, scaling, networking, and availability of container-based applications.
Enter Kubernetes. This open source project spun out of Google automates the process of deploying and managing multi-container applications at scale. While Kubernetes works mainly with Docker, it can also work with any container system that conforms to the Open Container Initiative (OCI) standards for container image formats and runtimes. And because Kubernetes is open source, with relatively few restrictions on how it can be used, it can be used freely by anyone who wants to run containers, most anywhere they want to run them.

Kubernetes vs. Docker

Note that Kubernetes isn’t a replacement for Docker. However, Kubernetes is a replacement for some of the higher-level technologies that have emerged around Docker.
One such technology is Docker Swarm, an orchestrator bundled with Docker. It’s still possible to use Swarm instead of Kubernetes, but Docker Inc. has chosen to make Kubernetes part of the Docker Community and Docker Enterprise editions going forward. 
Not that Kubernetes is a drop-in replacement for Swarm. Kubernetes is significantly more complex than Swarm, and requires more work to deploy. But again, the work is intended to provide a big payoff in the long run—a more manageable, resilient application infrastructure. For development work, and smaller container clusters, Docker Swarm presents a simpler choice. 


What Kubernetes does for containers

High-level languages, like Python or C#, provide the user with abstractions and libraries so they can focus on accomplishing the tasks at hand, instead of getting mired down in details of memory management.
Kubernetes works the same way with container orchestration. It provides high-level abstractions for managing groups of containers that allow Kubernetes users to focus on how they want applications to run, rather than worrying about specific implementation details. The behaviors they need are decoupled from the components that provide them.
Here are some of the tasks Kubernetes is designed to automate and simplify:
Deploy multi-container applications. Many applications don’t live in just one container. They’re built out of a bundle of containers—a database here, a web front end there, perhaps a caching server. Microservices are constructed in this fashion as well, typically drawing on separate databases for each service and web protocols and APIs to tie services together. Although there are long-term advantages to building apps as microservices, it comes with a lot of near-term heavy lifting.
Kubernetes reduces the amount of work needed to implement such applications. You tell Kubernetes how to compose an app out of a set of containers, and Kubernetes handles the nitty-gritty of rolling them out, keeping them running, and keeping the components in sync with each other.
Scale containerized apps. Apps need to be able to ramp up and down to suit demand, to balance incoming load, and make better use of physical resources. Kubernetes has provisions for doing all these things, and for doing them in an automated, hands-off way.
Roll out new versions of apps without downtime. Part of the appeal of a container-based application development workflow is to enable continuous integration and delivery. Kubernetes has mechanisms for allowing graceful updates to new versions of container images, including rollbacks if something goes awry.
Provide networking, service discovery, and storage. Kubernetes handles many other fiddly details of container-based apps. Getting containers to talk to each other, handling service discovery, and providing persistent storage to containers from various providers (e.g., Amazon’s EBS) are all handled through Kubernetes and its APIs.
Do all this in most any environment. Kubernetes isn’t tied to a specific cloud environment or technology. It can run wherever there is support for containers, which means public clouds, private stacks, virtual and physical hardware, and a single developer’s laptop are all places for Kubernetes to play. Kubernetes clusters can also run on any mix of the above. This even includes mixes of Windows and Linux systems.
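To make "you tell Kubernetes how to compose an app" a little more concrete, here is a minimal sketch that builds a Deployment description as a plain Python dictionary and prints it as JSON. The field layout follows the apps/v1 Deployment schema; the helper name make_deployment and the values "web-frontend" and "nginx:1.15" are placeholders chosen for illustration, not anything prescribed by the article.

```python
import json


def make_deployment(name: str, image: str, replicas: int, port: int) -> dict:
    """Build a Deployment manifest (apps/v1 layout) as a plain dict."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            # How many copies of the pod should be running.
            "replicas": replicas,
            # The selector must match the labels on the pod template below.
            "selector": {"matchLabels": dict(labels)},
            "template": {
                "metadata": {"labels": dict(labels)},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            "ports": [{"containerPort": port}],
                        }
                    ]
                },
            },
        },
    }


if __name__ == "__main__":
    # Placeholder name and image, for illustration only.
    print(json.dumps(make_deployment("web-frontend", "nginx:1.15", 3, 80), indent=2))
```

The description states what should exist (three replicas of one container image) and leaves the rollout steps to Kubernetes. Manifests are more commonly written in YAML; JSON is used here only because it needs nothing beyond the standard library.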

How Kubernetes works

Kubernetes’s architecture makes use of various concepts and abstractions. Some of these are variations on existing, familiar notions, but others are specific to Kubernetes.
The highest-level Kubernetes abstraction, the cluster, refers to the group of machines running Kubernetes (itself a clustered application) and the containers managed by it. A Kubernetes cluster must have a master, the system that commands and controls all the other Kubernetes machines in the cluster. A highly available Kubernetes cluster replicates the master’s facilities across multiple machines. But only one master at a time runs the job scheduler and controller-manager.
Each cluster contains Kubernetes nodes. Nodes might be physical machines or VMs. Again, the idea is abstraction: whatever the app is running on, Kubernetes handles deployment on that substrate. It is also possible to ensure that certain containers run only on VMs or only on bare metal.
Nodes run pods, the most basic Kubernetes objects that can be created or managed. Each pod represents a single instance of an application or running process in Kubernetes, and consists of one or more containers. Kubernetes starts, stops, and replicates all containers in a pod as a group. Pods keep the user’s attention on the application, rather than on the containers themselves. Details about how Kubernetes needs to be configured, from the state of pods on up, are kept in Etcd, a distributed key-value store.
Pods are created and destroyed on nodes as needed to conform to the desired state specified by the user in the pod definition. Kubernetes provides an abstraction called a controller for dealing with the logistics of how pods are spun up, rolled out, and spun down. Controllers come in a few different flavors depending on the kind of application being managed. For instance, the recently introduced “StatefulSet” controller is used to deal with applications that need persistent state. Another kind of controller, the deployment, is used to scale an app up or down, update an app to a new version, or roll back an app to a known-good version if there’s a problem.
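The idea of conforming to a desired state can be sketched as a comparison between what is wanted and what is running. The function below is a toy model, not Kubernetes code: the name reconcile and the "pod-N" naming scheme are assumptions made for this illustration.

```python
from typing import List, Tuple


def reconcile(desired_replicas: int, running_pods: List[str]) -> Tuple[List[str], List[str]]:
    """Return (pods_to_start, pods_to_stop) needed to reach the desired count."""
    diff = desired_replicas - len(running_pods)
    if diff > 0:
        # Too few pods: invent new names that do not collide with running ones.
        existing = set(running_pods)
        to_start: List[str] = []
        n = 0
        while len(to_start) < diff:
            name = f"pod-{n}"
            if name not in existing:
                to_start.append(name)
            n += 1
        return to_start, []
    if diff < 0:
        # Too many pods: stop the most recently listed ones.
        return [], running_pods[diff:]
    # Already at the desired state: nothing to do.
    return [], []
```

A real controller runs a comparison like this continuously, which is why a pod that dies is replaced without anyone asking for it.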
Because pods live and die as needed, we need a different abstraction for dealing with the application lifecycle. An application is supposed to be a persistent entity, even when the pods running the containers that comprise the application aren’t themselves persistent. To that end, Kubernetes provides an abstraction called a service.
A service describes how a given group of pods (or other Kubernetes objects) can be accessed via the network. As the Kubernetes documentation puts it, the pods that constitute the back end of an application might change, but the front end shouldn’t have to know about that or track it. Services make this possible.
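One way to picture this: a service typically identifies its pods by labels rather than by name, so the set of pods behind it can change freely. The sketch below is a simplified model under that assumption; select_endpoints and the pod dictionaries are hypothetical and do not call any Kubernetes API.

```python
from typing import Dict, List


def select_endpoints(selector: Dict[str, str], pods: List[dict]) -> List[str]:
    """Return the IPs of pods whose labels contain every key/value in the selector."""
    return [
        pod["ip"]
        for pod in pods
        if all(pod["labels"].get(key) == value for key, value in selector.items())
    ]


if __name__ == "__main__":
    # Hypothetical pods; in this model each has an IP and a set of labels.
    pods = [
        {"ip": "10.0.0.1", "labels": {"app": "web"}},
        {"ip": "10.0.0.2", "labels": {"app": "db"}},
        {"ip": "10.0.0.3", "labels": {"app": "web", "tier": "front"}},
    ]
    print(select_endpoints({"app": "web"}, pods))
```

If the pod at 10.0.0.1 is destroyed and a replacement with the same label appears at another address, callers of the service need not notice; only the result of the selection changes.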
A few more pieces internal to Kubernetes round out the picture. The scheduler parcels out workloads to nodes so that they’re balanced across resources and so that deployments meet the requirements of the application definitions. The controller manager ensures the state of the system—applications, workloads, etc.—matches the desired state defined in Etcd’s configuration settings.
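The scheduler's job of balancing workloads across nodes can be approximated with a very small placement rule. This is a deliberate simplification: the real scheduler weighs many more criteria, and pick_node and the node names below are assumptions for illustration only.

```python
from typing import Dict, Optional


def pick_node(cpu_request: float, free_cpu: Dict[str, float]) -> Optional[str]:
    """Pick the node with the most free CPU that can fit the request, or None."""
    # Filter step: keep only nodes with enough spare capacity.
    candidates = {node: free for node, free in free_cpu.items() if free >= cpu_request}
    if not candidates:
        return None
    # Score step: prefer the emptiest node so load spreads out.
    return max(candidates, key=candidates.get)


if __name__ == "__main__":
    print(pick_node(1.0, {"node-a": 0.5, "node-b": 2.0, "node-c": 1.5}))
```

Returning None models a workload that cannot be placed anywhere yet; it stays pending until capacity frees up.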
It’s important to keep in mind that none of the low-level mechanisms used by containers, like Docker itself, are replaced by Kubernetes. Rather, Kubernetes provides a larger set of abstractions for using them for the sake of keeping apps running at scale.

How Kubernetes makes containerized apps easier

Because Kubernetes introduces new abstractions and concepts, and because the learning curve for Kubernetes is high, it’s only normal to ask what the long-term payoffs are for using Kubernetes. Here’s a rundown of some of the specific ways running apps inside Kubernetes becomes easier.
Kubernetes manages app health, replication, load balancing, and hardware resource allocation for you. One of the most basic duties Kubernetes takes off your hands is the busywork of keeping an application up, running, and responsive to user demands. Apps that become “unhealthy,” or don’t conform to the definition of health you describe for them, can be automatically healed.
Another benefit Kubernetes provides is maximizing the use of hardware resources including memory, storage I/O, and network bandwidth. Applications can have soft and hard limits set on their resource usage. Many apps that use minimal resources can be packed together on the same hardware; apps that need to stretch out can be placed on systems where they have room to groove. And again, rolling out updates across a cluster, or rolling back if updates break, can be automated.
Kubernetes Helm charts ease the deployment of preconfigured applications. Package managers such as Debian Linux’s APT and Python’s Pip save users the trouble of manually installing and configuring an application. This is especially handy when an application has multiple external dependencies.
Helm is something like a package manager for Kubernetes. Many popular software applications must run as multiple, ganged-together containers in Kubernetes. Helm provides a definition mechanism, a “chart,” that describes how a given piece of software can be run as a group of containers inside Kubernetes.
You can create your own Helm charts from scratch, and you might have to if you’re building a custom app to be deployed internally. But if you’re using a popular application that has a common deployment pattern, there is a good chance someone has already composed a Helm chart for it and published it by way of the Kubeapps.com directory.
Kubernetes simplifies management of storage, secrets, and other application-related resources. Containers are meant to be immutable; whatever you put into them isn’t supposed to change. But applications need state, meaning they need a reliable way to deal with external storage volumes. That’s made all the more complicated by the way containers live, die, and are reborn across the lifetime of an app.
Kubernetes has abstractions to allow containers and apps to deal with storage in the same decoupled way as other resources. Many common kinds of storage, from Amazon EBS volumes to plain old NFS shares, can be accessed via Kubernetes storage drivers, called volumes. Normally, volumes are bound to a specific pod, but a volume subtype called a “Persistent Volume” can be used for data that needs to live on independently of any pod.
Containers often need to work with “secrets”—credentials like API keys or service passwords that you don’t want hardwired in a container or stashed openly on a disk volume. While third-party solutions are available for this, like Docker secrets and HashiCorp Vault, Kubernetes has its own mechanism for natively handling secrets, although it does need to be configured with care. For instance, Etcd must be configured to use SSL/TLS when sending information including secrets between nodes, rather than in plaintext. 
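As a rough sketch of what a native secret looks like, the snippet below builds a Secret manifest as a Python dictionary. The layout (kind Secret, type Opaque, base64-encoded values under data) follows the core v1 API; the helper name make_secret and the sample key and value are hypothetical.

```python
import base64


def make_secret(name: str, values: dict) -> dict:
    """Build a Secret manifest; values under 'data' are base64-encoded strings."""
    encoded = {
        key: base64.b64encode(value.encode("utf-8")).decode("ascii")
        for key, value in values.items()
    }
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name},
        "type": "Opaque",
        "data": encoded,
    }


if __name__ == "__main__":
    # Placeholder credential, for illustration only.
    print(make_secret("api-creds", {"api-key": "not-a-real-key"}))
```

Note that base64 is an encoding, not encryption: anyone who can read the manifest or the stored data can decode it. That is one reason the surrounding configuration, such as transport encryption for Etcd, deserves the care described above.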
Kubernetes apps can run in hybrid and multi-cloud environments. One of the long-standing dreams of cloud computing is to be able to run any app in any cloud, or any mix of clouds public or private. This isn’t just to avoid vendor lock-in, but also to take advantage of features specific to individual clouds.

Wednesday, February 21, 2018

Docker tutorial: Get started with Docker Compose

Learn how to use Docker’s native service configuration and deployment tool for testing and debugging multi-container apps

Containers are meant to provide component isolation in a modern software stack. Put your database in one container, your web application in another, and they can all be scaled, managed, restarted, and swapped out independently. But developing and testing a multi-container application isn’t anything like working with a single container at a time.
Docker Compose was created by Docker to simplify the process of developing and testing multi-container applications. It’s a command-line tool, reminiscent of the Docker client, that takes in a specially formatted descriptor file to assemble applications out of multiple containers and run them in concert on a single host. (Tools like Docker Swarm or Kubernetes deploy multi-container apps in production across multiple hosts.)
In this tutorial, we’ll walk through the steps needed to define and deploy a simple multi-container web service app. While Docker Compose is normally used for development and testing, it can also be used for deploying production applications. For the sake of this discussion, we will concentrate on dev-and-test scenarios.
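To give a feel for what such a descriptor contains before going further, here is a small sketch that assembles one as a Python dictionary. Compose files are normally written in YAML; because JSON is a subset of YAML, the JSON printed here should be usable as a descriptor as well. The service names "web" and "cache", the port, and the helper make_compose are assumptions for illustration.

```python
import json


def make_compose(web_port: int = 5000) -> dict:
    """Build a two-service Compose descriptor as a plain dict."""
    return {
        "version": "3",
        "services": {
            "web": {
                # Build the web image from the Dockerfile in the current directory.
                "build": ".",
                # Map the host port to the same port inside the container.
                "ports": [f"{web_port}:{web_port}"],
                # Start the cache service before the web service.
                "depends_on": ["cache"],
            },
            # Use a prebuilt image for the second container.
            "cache": {"image": "redis:alpine"},
        },
    }


if __name__ == "__main__":
    print(json.dumps(make_compose(), indent=2))
```

Each entry under services becomes one container, and Compose starts them together on a single host.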