Wednesday, October 26, 2016

Bedrock

"Bedrock is a simple, modular, WAN-replicated data foundation for global-scale applications.

 Bedrock was built by Expensify, and is a networking and distributed transaction layer built atop SQLite, the fastest, most reliable, and most widely distributed database in the world."

http://bedrockdb.com/

https://github.com/Expensify/Bedrock

mbed

"ARM mbed OS is an open source embedded operating system designed specifically for the "things" in the Internet of Things.

It includes all the features you need to develop a connected product based on an ARM Cortex-M microcontroller, including security, connectivity, an RTOS, and drivers for sensors and I/O devices."


Monday, October 17, 2016

D4M.jl

"A Dynamic Distributed Dimensional Data Model(D4M) module for Julia.

 D4M is a breakthrough in computer programming that combines the advantages of five distinct processing technologies (sparse linear algebra, associative arrays, fuzzy algebra, distributed arrays, and triple-store/NoSQL databases such as Hadoop HBase and Apache Accumulo) to provide a database and computation system that addresses the problems associated with Big Data. D4M significantly improves search, retrieval, and analysis for any business or service that relies on accessing and exploiting massive amounts of digital data."
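The associative-array idea mentioned above can be sketched in a few lines. This is only an illustration in plain Python, not the D4M.jl API: the `Assoc` class, its methods, and the document/word keys are all hypothetical names chosen for the example. It assumes an associative array behaves like a sparse matrix whose rows and columns are labelled by strings, so that addition merges keys and multiplication correlates them.

```python
from collections import defaultdict


class Assoc:
    """Hypothetical toy associative array: a sparse matrix with string keys."""

    def __init__(self, triples=()):
        # Store only non-empty entries, keyed by (row, col).
        self.data = {}
        for row, col, val in triples:
            self.data[(row, col)] = self.data.get((row, col), 0) + val

    def __add__(self, other):
        # Union of keys; values at matching keys are summed.
        out = Assoc()
        out.data = dict(self.data)
        for key, val in other.data.items():
            out.data[key] = out.data.get(key, 0) + val
        return out

    def transpose(self):
        return Assoc((col, row, val) for (row, col), val in self.data.items())

    def __matmul__(self, other):
        # Sparse matrix product: match our column keys to the other's row keys.
        by_row = defaultdict(list)
        for (row, col), val in other.data.items():
            by_row[row].append((col, val))
        out = Assoc()
        for (row, mid), val in self.data.items():
            for col, other_val in by_row.get(mid, ()):
                out.data[(row, col)] = out.data.get((row, col), 0) + val * other_val
        return out


# Made-up document/word incidence data.
A = Assoc([
    ("doc1", "word|cat", 1),
    ("doc1", "word|dog", 1),
    ("doc2", "word|cat", 1),
])
B = Assoc([("doc1", "word|cat", 2)])

total = A + B                 # merged counts
cooccur = A.transpose() @ A   # word-by-word co-occurrence counts
print(total.data)
print(cooccur.data)
```

Here `A.transpose() @ A` counts how often two words share a document, which is the kind of query the description above attributes to combining sparse linear algebra with triple-store data.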

 https://github.com/achen12/D4M.jl

http://www.mit.edu/%7Ekepner/D4M/

 Julia Implementation of the Dynamic Distributed Dimensional Data Model - https://arxiv.org/abs/1608.04041

Introducing D3 Science:  Understanding Applications and Infrastructure - https://arxiv.org/abs/1609.03647

D4M 3.0 - https://arxiv.org/abs/1702.03253



Futhark

"Futhark is a small programming language designed to be compiled to efficient GPU code. It is a statically typed, data-parallel, and purely functional array language, and comes with a heavily optimising ahead-of-time compiler that generates GPU code via OpenCL. Futhark is not designed for graphics programming, but instead uses the compute power of the GPU to accelerate data-parallel array computations. We support regular nested data-parallelism, as well as a form of imperative-style in-place modification of arrays, while still preserving the purity of the language via the use of a uniqueness type system.

Futhark is not intended to replace your existing languages. Our intended use case is that Futhark is only used for relatively small but compute-intensive parts of an application. The Futhark compiler generates code that can be easily integrated with non-Futhark code. For example, you can compile a Futhark program to a Python module that internally uses PyOpenCL to execute code on the GPU, yet looks like any other Python module from the outside. The Futhark compiler will also generate more conventional C code, which can be accessed from any language with a basic FFI."

http://futhark-lang.org/

https://github.com/diku-dk/futhark/

https://github.com/HIPERFIT/futhark

 APL on GPUs: A TAIL from the Past, Scribbled in Futhark - http://hgpu.org/?p=16592

https://github.com/HIPERFIT/futhark-fhpc16

https://github.com/melsman/apltail/

Purely Functional GPU Programming with Futhark - https://fosdem.org/2017/schedule/event/functional_gpu_futhark/

Design and Implementation of the Futhark Programming Language - https://hgpu.org/?p=17903

Friday, October 14, 2016

GPU & DB Literature

GPU-accelerated database systems:  Survey and open challenges

http://hgpu.org/?p=12738


Overtaking CPU DBMSes with a GPU in Whole-Query Analytic Processing with Parallelism-Friendly Execution Plan Optimization

http://hgpu.org/?p=16615

Parallel Inception

https://archive.fosdem.org/2016/schedule/event/hpc_bigdata_mpp/

https://github.com/kdunn926/plpygpgpu/blob/master/notebook.ipynb 

CUDAnative

"This package provides support for compiling and executing native Julia kernels on CUDA hardware. It is a work in progress, highly experimental, and for now requires a version of Julia capable of generating PTX code (ie. the fork at JuliaGPU/julia)."

https://github.com/JuliaGPU/CUDAnative.jl

Wednesday, October 12, 2016

pandasql

"pandasql allows you to query pandas DataFrames using SQL syntax. It works similarly to sqldf in R. pandasql seeks to provide a more familiar way of manipulating and cleaning data for people new to Python or pandas."

https://github.com/yhat/pandasql

http://blog.yhat.com/posts/pandasql-intro.html
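A rough sketch of the idea behind pandasql, using only the Python standard library: tabular rows are loaded into an in-memory SQLite database and then queried with ordinary SQL. This is not pandasql's own code. The `people` table, the sample rows, and the `query_rows` helper are made up for illustration. With pandas and pandasql installed, the equivalent call would be along the lines of `sqldf("SELECT ...", locals())`.

```python
import sqlite3

# Hypothetical sample data standing in for a DataFrame's rows.
rows = [
    ("alice", "nyc", 34),
    ("bob", "sf", 27),
    ("carol", "nyc", 41),
]


def query_rows(sql):
    """Load the sample rows into an in-memory SQLite table and run `sql`."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE people (name TEXT, city TEXT, age INTEGER)")
        conn.executemany("INSERT INTO people VALUES (?, ?, ?)", rows)
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


# Group and aggregate with plain SQL instead of DataFrame methods.
result = query_rows(
    "SELECT city, COUNT(*) AS n, AVG(age) AS mean_age "
    "FROM people GROUP BY city ORDER BY city"
)
print(result)
```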

Ultibo

"Ultibo core is an embedded or bare metal development environment for Raspberry Pi. It is not an operating system but provides many of the same services as an OS, things like memory management, networking, filesystems and threading plus much more. So you don’t have to start from scratch just to create your ideas."

https://ultibo.org/

Vagrant

"Vagrant is a tool for building and distributing development environments.

Development environments managed by Vagrant can run on local virtualized platforms such as VirtualBox or VMware, in the cloud via AWS or OpenStack, or in containers such as with Docker or raw LXC.

Vagrant provides the framework and configuration format to create and manage complete portable development environments. These development environments can live on your computer or in the cloud, and are portable between Windows, Mac OS X, and Linux.

Vagrant is an open-source software product for building and maintaining portable virtual development environments. The core idea behind its creation lies in the fact that environment maintenance becomes increasingly difficult in a large project with multiple technical stacks. Vagrant manages all the necessary configurations for the developers in order to avoid the unnecessary maintenance and setup time, and increases development productivity. Vagrant is written in the Ruby language, but its ecosystem supports development in almost all major languages.

Vagrant uses "Provisioners" and "Providers" as building blocks to manage the development environments. Provisioners are tools that allow users to customize the configuration of virtual environments. Puppet and Chef are the two most widely used provisioners in the Vagrant ecosystem. Providers are the services that Vagrant uses to set up and create virtual environments. Support for VirtualBox, Hyper-V, and Docker virtualization ships with Vagrant, while VMware and AWS are supported via plugins.

Vagrant sits on top of virtualization software as a wrapper and helps the developer interact easily with the providers. It automates the configuration of virtual environments using Chef or Puppet, and the user does not have to directly use any other virtualization software. Machine and software requirements are written in a file called "Vagrantfile" to execute necessary steps in order to create a development-ready box. Box is a format and an extension (.box) for Vagrant environments that is copied to another machine in order to replicate the same environment."
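To make the "Vagrantfile" idea concrete, here is a small Python sketch that renders a minimal one. A Vagrantfile itself is Ruby. The `render_vagrantfile` helper, the `example/box` box name, and the folder paths are hypothetical placeholders, and only the common `config.vm.box` and `config.vm.synced_folder` settings are shown.

```python
def render_vagrantfile(box, host_dir=None, guest_dir=None):
    """Return the text of a minimal Vagrantfile (hypothetical helper)."""
    lines = [
        'Vagrant.configure("2") do |config|',
        f'  config.vm.box = "{box}"',
    ]
    # Optionally share a host directory with the guest machine.
    if host_dir and guest_dir:
        lines.append(f'  config.vm.synced_folder "{host_dir}", "{guest_dir}"')
    lines.append("end")
    return "\n".join(lines) + "\n"


# "example/box" and the paths are placeholders, not real published names.
vagrantfile = render_vagrantfile("example/box", "./src", "/home/vagrant/src")
print(vagrantfile)
```

Writing that text to a file named `Vagrantfile` and running `vagrant up` in the same directory would be the usual next step, assuming the named box actually exists for the chosen provider.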


https://github.com/mitchellh/vagrant

https://www.vagrantup.com/

Vagrant Tutorial - https://manski.net/2016/09/vagrant-multi-machine-tutorial/

How to Create a CentOS Vagrant Base Box - https://github.com/ckan/ckan/wiki/How-to-Create-a-CentOS-Vagrant-Base-Box

Using Ansible to Provision Vagrant Boxes - https://fedoramagazine.org/using-ansible-provision-vagrant-boxes/


Virtualize OSX on Linux

"I've been a Linux user for something like 10 years now. In order to develop and maintain psutil on different platforms I've been using the excellent VirtualBox. With it, during the years, I've been able to virtualize different versions of Windows, FreeBSD, OpenBSD, NetBSD and Solaris and implement and protract support for such platforms inside psutil. Without VirtualBox there really wouldn't exist psutil as it stands nowadays.

At some point I also managed to virtualize OSX by using a hacked version of OSX called iDeneb which is based on OSX 10.5 / Leopard (note: 9 years old), and that is what I've been using up until today. Of course such an old hacked version of OSX isn't nice to deal with. It ships Python 2.5, it kernel panics, I had to reinstall it from scratch quite often.

I'm really not sure how I could have been missing this for all this time, but it turns out emulating OSX on Linux really is as easy as executing a one-liner:
vagrant init AndrewDryga/vagrant-box-osx; vagrant up
And that really is it! I mean... you're literally good to go and start developing! That will create a Vagrantfile, download a pre-configured OSX image via the internet (10GB or something) and finally run it in VirtualBox. The whole package includes:

OSX 10.10.4 / Yosemite

XCode 6.4 + gcc

brew

Python 2.7

In a couple of hours I modified the original Vagrantfile a little and managed to mount a directory which is shared between the VM and the host (my laptop) and ended up with this Vagrantfile."

http://grodola.blogspot.com/2016/10/virtualize-osx-on-linux_53.html

https://atlas.hashicorp.com/AndrewDryga/boxes/vagrant-box-osx/

Wednesday, October 5, 2016

pyMIC

"Python module to offload computation in a Python program to the Intel Xeon Phi coprocessor. It contains offloadable arrays and device management functions. It supports invocation of native kernels (C/C++, Fortran) and blends in with Numpy's array types for float, complex, and int data types."

https://github.com/01org/pyMIC

https://software.intel.com/en-us/articles/pymic-a-python-offload-module-for-the-intelr-xeon-phitm-coprocessor

https://www.euroscipy.org/2015/schedule/presentation/9/

https://arxiv.org/abs/1607.00844

Mininet

"Mininet creates a realistic virtual network, running real kernel, switch and application code, on a single machine (VM, cloud or native), in seconds, with a single command."

http://mininet.org/

Tuesday, October 4, 2016

conventions


NACDD - https://geo-ide.noaa.gov/wiki/index.php?title=NetCDF_Attribute_Convention_for_Dataset_Discovery

NCEI NetCDF Templates - https://www.nodc.noaa.gov/data/formats/netcdf/v2.0/

NetCDF CF - http://cfconventions.org/

UGRID - https://github.com/ugrid-conventions/ugrid-conventions

OpenHPC

"OpenHPC is a collaborative, community effort that initiated from a desire to aggregate a number of common ingredients required to deploy and manage High Performance Computing (HPC) Linux clusters including provisioning tools, resource management, I/O clients, development tools, and a variety of scientific libraries. Packages provided by OpenHPC have been pre-built with HPC integration in mind with a goal to provide re-usable building blocks for the HPC community. Over time, the community also plans to identify and develop abstraction interfaces between key components to further enhance modularity and interchangeability. The community includes representation from a variety of sources including software vendors, equipment manufacturers, research institutions, supercomputing sites, and others.

OpenHPC provides pre-built binaries via repositories for use with standard Linux package manager tools (e.g. yum or zypper). Package repositories are housed at https://build.openhpc.community. To get started, you can enable an OpenHPC repository locally through installation of an ohpc-release RPM which includes gpg keys for package signing and defines the URL locations for [base] and [update] package repositories."

http://openhpc.community/

https://github.com/openhpc/ohpc

https://build.openhpc.community/

https://www.nextplatform.com/2016/11/28/openhpc-pedal-put-compute-metal/

https://archive.fosdem.org/2016/schedule/event/hpc_bigdata_openhpc/

BOLT

"BOLT targets a high-performing OpenMP implementation, especially specialized for fine-grain parallelism. Unlike other OpenMP implementations, BOLT utilizes a lightweight threading model for its underlying threading mechanism. It currently adopts Argobots, a new holistic, low-level threading and tasking runtime, in order to overcome shortcomings of conventional OS-level threads. Its runtime and compiler are based on the OpenMP runtime and Clang in LLVM, respectively."

http://www.mcs.anl.gov/bolt/

https://github.com/pmodels/bolt-runtime

https://github.com/pmodels/argobots

https://wiki.mpich.org/mpich/index.php/MPI%2BArgobots

Sunday, October 2, 2016

pgAdmin

"pgAdmin 4 is a complete rewrite of pgAdmin, built using Python and Javascript/jQuery. A desktop runtime written in C++ with Qt allows it to run standalone for individual users, or the web application code may be deployed directly on a webserver for use by one or more users through their web browser. The software has the look and feel of a desktop application whatever the runtime environment is, and vastly improves on pgAdmin III with updated user interface elements, multi-user/web deployment options, dashboards and a more modern design."

https://www.pgadmin.org/

gitless

"Gitless is an experimental version control system built on top of Git. Many people complain that Git is hard to use. We think the problem lies deeper than the user interface, in the concepts underlying Git. Gitless is an experiment to see what happens if you put a simple veneer on an app that changes the underlying concepts. Because Gitless is implemented on top of Git (could be considered what Git pros call a "porcelain" of Git), you can always fall back on Git. And of course your coworkers you share a repo with need never know that you're not a Git aficionado."

http://gitless.com/