Getting started with scientific computing on a new Mac (OS X)

You got your new (expensive) Mac. Exciting! Next, you want to use it for scientific computing or data analysis, e.g. deep learning, algorithm development, or machine learning. This post walks you through setting up the system for that kind of work.

First of all, the programming environment on OS X depends on Xcode, which is distributed through the Apple App Store (yes, that is where you have to get it). Download the newest version so you have the newest features. You'll need to sign in with your Apple ID; if you don't have one yet, creating one is free.

Now you are ready to install the next important tool, MacPorts. MacPorts is a package manager that bundles almost all the common programming tools and libraries. Download it from the official website below and follow the installation guide there.

https://www.macports.org/

After it's installed, open a Terminal window (found under Launchpad > Other, or by typing "Terminal" into the Launchpad search bar) and run the following command:

xcode-select --install

This installs the command line tools MacPorts needs. Then accept the Xcode license with:

xcodebuild -license
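If you are unsure whether the command line tools are already in place, a quick check along these lines may help. It is only a sketch: the variable name tools_ready is made up for illustration, and it relies on xcode-select -p, which prints the active developer directory.

```shell
# Check whether the Xcode command line tools are set up.
# xcode-select -p prints the active developer directory and fails if none is set.
if command -v xcode-select >/dev/null 2>&1 && xcode-select -p >/dev/null 2>&1; then
    echo "Developer tools found at: $(xcode-select -p)"
    tools_ready=yes
else
    echo "Command line tools not found; run: xcode-select --install"
    tools_ready=no
fi
```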

Now you are ready to install ports. The MacPorts guide lists the most useful port commands:
https://guide.macports.org/#using

Installing a port requires sudo (administrator) privileges. If you don't have them, ask the computer's admin to install the port for you or to grant you the privilege. The command to install a package named packagename is:

sudo port install packagename

You can browse or search everything that is installable under "Available ports" on the MacPorts website. I have found the following ports particularly useful. Note that I specify Python 3.7 rather than Python 2 for the Python packages. I made this change recently (Aug 2018) because the NumPy community is dropping support for Python 2 and most other projects are doing the same. I believe it is time to switch:

  • python37, py37-numpy, py37-matplotlib, py37-ipython, py37-notebook
  • inkscape
  • cmake
  • openmpi
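Several ports can be installed with a single command. The helper below is a minimal sketch of that: install_ports is a hypothetical name of my own, not something MacPorts provides, and it simply refreshes the ports tree before handing every argument to port install.

```shell
# install_ports: hypothetical helper that installs every port named on its command line.
install_ports() {
    # Bail out early if MacPorts is not on the PATH.
    if ! command -v port >/dev/null 2>&1; then
        echo "port not found; install MacPorts first" >&2
        return 1
    fi
    # Refresh MacPorts and the ports tree, then install the requested ports.
    sudo port selfupdate && sudo port install "$@"
}
```

With the function defined, the list above could be installed with `install_ports python37 py37-numpy py37-matplotlib py37-ipython py37-notebook inkscape cmake openmpi`. Expect this to take a while on a fresh machine.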

MacPorts manages dependencies automatically: when you install a port, everything it depends on is installed along with it. So in the end you don't have to name every package by hand, because most of them arrive as dependencies of others. To see what a port itself requires, run port deps py37-notebook. The depends: selector below answers the reverse question, listing the ports that declare py37-notebook as a dependency:

$ port echo depends:py37-notebook
py37-jupyter                    
py37-jupyterlab                 
py37-jupyterlab_launcher        
py37-metakernel                 
py37-widgetsnbextension
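Dependencies can be inspected in both directions. The following sketch uses the port actions deps, rdeps and dependents; the variable name target is only for illustration, and no output is shown because it depends on your ports tree.

```shell
target=py37-notebook

if command -v port >/dev/null 2>&1; then
    port deps "$target"         # what the port needs directly
    port rdeps "$target"        # the full, recursive dependency tree
    port dependents "$target"   # installed ports that depend on it
else
    echo "port not found; install MacPorts first" >&2
fi
```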

Running other commands in Terminal assumes you already know the basics of shell scripting. With compilers and interpreters now installed, you can install an IDE and start writing code in the language you like. If the IDE asks where a compiler or interpreter lives and you don't know, the which command prints its path, e.g.

$ which cmake
/opt/local/bin/cmake

The command returns the path to cmake on your Mac. MacPorts usually installs its binaries in the /opt/local/bin/ directory, so your Python 3 is very likely there too, as /opt/local/bin/python3.7 (a plain /opt/local/bin/python exists only once you have chosen a default with port select). If which prints nothing, the executable is not available yet. For more details, see an introductory document on the $PATH variable in the shell.
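If which comes back empty for something you just installed, the usual suspect is $PATH. Here is a small sketch that checks whether /opt/local/bin is on the search path and then looks up a couple of tools; the tool names in the loop are just examples, and command -v is used as the portable counterpart of which.

```shell
# Is the default MacPorts bin directory on the search path?
case ":$PATH:" in
    *":/opt/local/bin:"*) on_path=yes ;;
    *)                    on_path=no ;;
esac
echo "/opt/local/bin on PATH: $on_path"

# Look up a few tools; command -v prints nothing when a tool is missing.
for tool in cmake python3.7; do
    if tool_path=$(command -v "$tool"); then
        echo "$tool -> $tool_path"
    else
        echo "$tool not found"
    fi
done
```

If the directory turns out to be missing, adding a line such as `export PATH=/opt/local/bin:/opt/local/sbin:$PATH` to your shell startup file is the common fix; the MacPorts installer normally does this for you.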

Another very useful port command searches the available packages by keyword:

port search [--name] [--regex] '<searchtext>'

[--name] and [--regex] are both optional. For example, if I search for inkscape, a very useful program for drawing vector graphics, the result looks like the following figure.
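As a concrete instance of the search syntax, the sketch below restricts the match to port names and treats the search text as a regular expression. The pattern is my own example, and the output is omitted because it depends on the current ports tree.

```shell
# Match only port names that start with "inkscape".
pattern='^inkscape'

if command -v port >/dev/null 2>&1; then
    port search --name --regex "$pattern"
else
    echo "port not found; install MacPorts first" >&2
fi
```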

As for IDEs on a Mac, I recommend:

  • For C++/C or Fortran programming: Netbeans (Download online)
  • For Python programming: PyCharm (Download online)
  • For general programming: MacVim (Available as a macports port)
  • For HTML, CSS, JavaScript, etc. (website design): Coda (Download online)

The next important question: ports depend on one another (py37-notebook depends on py37-jupyter_core, for example), yet each port maintains its own version. How do you make sure every port is up to date, or at least that they all work together without conflicts? And how do you make sure the Python package you are developing can be readily used by others, who may not have the module versions you need (e.g. you developed against NumPy 1.15.4 but your user installed 1.14)? There is an optional piece of software that takes care of this headache, which I will explain in another post:

Anaconda, should I bother?
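On the MacPorts side alone, keeping installed ports current can be sketched with the commands below. This only covers ports installed through MacPorts and does nothing about what your users have installed; the variable name pkg is illustrative.

```shell
pkg=py37-numpy

if command -v port >/dev/null 2>&1; then
    sudo port selfupdate         # refresh MacPorts itself and the ports tree
    port outdated                # list installed ports that have a newer version
    sudo port upgrade outdated   # upgrade them together with their dependencies
    port installed "$pkg"        # show which version(s) of a port are installed
else
    echo "port not found; install MacPorts first" >&2
fi
```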