Wednesday, February 3, 2016

FuseSoC 1.4

When I consider the magnitude of the subject which I am to bring before my readers-a subject, in which the interests, not of this country, nor of Europe alone, but of the whole world, and of posterity, are involved: and when I think, at the same time, on the weakness of the advocate who has undertaken this great cause-when these reflections press upon my mind, it is impossible for me not to feel both terrified and concerned at my own inadequacy to such a task. But when I reflect, however, on the encouragement which I have had, through the whole course of a long and laborious examination of this question, and how much candour I have experienced, and how conviction has increased within my own mind, in proportion as I have advanced in my labours;-when I reflect, especially, that however averse any gentleman may now be, yet we shall all be of one opinion in the end;-when I turn myself to these thoughts, I take courage-I determine to forget all my other fears, and I march forward with a firmer step in the full assurance that my cause will bear me out, and that I shall be able to justify my decision to release FuseSoC 1.4.

William Wilberforce, a man who apparently had a thing for extremely long sentences, might have said something a bit similar about the end of slavery, but that's not the topic for today. Today's topic is the new FuseSoC release. We now live in a world of git snapshots (and less slavery than 1789), and I don't expect an increase in the number after the decimal point in the configure script to matter to most people. Still, I think it's nice to do a stable release once in a while. It helps packagers (where are you by the way? I'm still waiting for someone to make a FuseSoC rpm or deb), and more importantly, I can take this as an opportunity to break things. This will be the first version where a part of CAPI1 gets deprecated. There are some other noteworthy changes since FuseSoC 1.3, and although I have summarized the changes in the NEWS file, and I recently wrote a post about the new IP-Xact support, there are other changes that could be worth explaining in further detail. I do realize that this would not be needed if someone had written proper documentation (where are you by the way? I'm still waiting for someone to write the documentation for me), but no one has, so this will have to do for now.

Prominent features

 

 File sets

Up until now, source code has mainly been Verilog, with some support for C(++) for Verilator testbenches and VPI modules, together with very limited support for VHDL in the Quartus backend flow. This part has been completely overhauled in this release, where filesets are introduced. Filesets are modeled on the IP-Xact filesets, and these will have a stronger connection in the future, as IP-Xact support is also introduced in this version. The verilog section will be deprecated at some point, so I suggest moving to filesets already. Here's a quick example of how to make the switch. The verilog section of the de0 nano system...

[verilog]
src_files =
 rtl/verilog/clkgen.v
 rtl/verilog/orpsoc_top.v
 backend/rtl/verilog/pll.v
 rtl/verilog/wb_intercon.v
 rtl/verilog/wb_intercon_dbg.v

tb_private_src_files =
 bench/orpsoc_tb.v
 bench/uart_decoder.v
include_files =
 rtl/verilog/include/or1200_defines.v
 rtl/verilog/include/orpsoc-defines.v
 rtl/verilog/wb_intercon.vh
 rtl/verilog/wb_intercon_dbg.vh
 sw/clear_r3_and_jump_to_0x100.vh
 sw/spi_uimage_loader.vh

tb_include_files =
 bench/spi_image.vh
 bench/test-defines.v



...becomes...

[fileset rtl_files]
files =
 rtl/verilog/clkgen.v
 rtl/verilog/orpsoc_top.v
 backend/rtl/verilog/pll.v
 rtl/verilog/wb_intercon.v
 rtl/verilog/wb_intercon_dbg.v
file_type = verilogSource
usage = sim synth

[fileset tb_files]
files =
 bench/orpsoc_tb.v
 bench/uart_decoder.v
file_type = verilogSource
usage = sim

[fileset include_files]
files =
 rtl/verilog/include/or1200_defines.v
 rtl/verilog/include/orpsoc-defines.v
 rtl/verilog/wb_intercon.vh
 rtl/verilog/wb_intercon_dbg.vh
 sw/clear_r3_and_jump_to_0x100.vh
 sw/spi_uimage_loader.vh
file_type = verilogSource
is_include_file = true
usage = sim synth

[fileset tb_include_files]
files =
 bench/spi_image.vh
 bench/test-defines.v
file_type = verilogSource
is_include_file = true
usage = sim


It's a few more lines, but the added flexibility outweighs the few extra bytes of ASCII characters. It's now possible to set other file types, such as vhdlSource, or request specific language versions such as verilogSource-2001. The available file types are enumerated in the IP-Xact standard, with some additional ones allowed by FuseSoC (verilogSource-2005 and vhdlSource-2008 for now). The name of a fileset is not important, but the filesets are parsed in the order they appear in the .core file, so make sure that they are correctly ordered when dealing with languages where this matters.

The usage tag can be set to one or more items, and determines which tools will use the fileset. These can be either a category (sim, synth) or a specific tool (verilator, quartus, modelsim...).

For languages that have a concept of libraries (VHDL, not Verilog), the logical_name tag can be used to indicate library affiliation.
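As a sketch (the fileset name, file paths and library name below are all made up for illustration), a VHDL fileset that belongs to a library and should only be used by one specific tool could look like this:

```
[fileset vhdl_utils]
files =
 rtl/vhdl/utils_pkg.vhd
 rtl/vhdl/fifo.vhd
file_type = vhdlSource
logical_name = utils_lib
usage = modelsim
```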


Another new feature is per-file attributes, which can be used to override the default values for the fileset. These attributes are placed inside square brackets at the end of the file name. They are comma-separated and are either of the form attribute=value, or just attribute, to set boolean attributes. With per-file attributes, the above example can be changed to:

[fileset rtl_files]
files =
 rtl/verilog/clkgen.v
 rtl/verilog/orpsoc_top.v
 backend/rtl/verilog/pll.v
 rtl/verilog/wb_intercon.v
 rtl/verilog/wb_intercon_dbg.v
 rtl/verilog/include/or1200_defines.v[is_include_file]
 rtl/verilog/include/orpsoc-defines.v[is_include_file]
 rtl/verilog/wb_intercon.vh[is_include_file]
 rtl/verilog/wb_intercon_dbg.vh[is_include_file]
 sw/clear_r3_and_jump_to_0x100.vh[is_include_file]
 sw/spi_uimage_loader.vh[is_include_file]
file_type = verilogSource
usage = sim synth

[fileset tb_files]
files =
 bench/orpsoc_tb.v
 bench/uart_decoder.v
 bench/spi_image.vh[is_include_file,file_type=verilogSource-2005]
 bench/test-defines.v[is_include_file]
file_type = verilogSource
usage = sim



Note that the file_type attribute for spi_image.vh was only added to show how multiple attributes can be set.
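The per-file attribute syntax is simple enough to sketch in a few lines of Python. This is not FuseSoC's actual parser, just a minimal illustration of the rules described above: comma-separated attributes inside square brackets, either attribute=value pairs or bare boolean attributes.

```python
def parse_file_entry(entry):
    """Split 'path[attr1,attr2=value]' into (path, attributes dict).

    Bare attributes become True; 'attr=value' pairs keep their value.
    """
    name, _, rest = entry.partition('[')
    attributes = {}
    if rest.endswith(']'):
        for attr in rest[:-1].split(','):
            key, sep, value = attr.partition('=')
            attributes[key] = value if sep else True
    return name, attributes

print(parse_file_entry('bench/spi_image.vh[is_include_file,file_type=verilogSource-2005]'))
```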

If there is an IP-Xact component file for the core, FuseSoC can parse that for the filesets instead. In that case, the above example will look like this:

[main]
component =
 de0_nano.xml

Yep, that's right. We can leave it all to IP-Xact. Except when we need file types that aren't in the IP-Xact standard (again, Verilog 2005 and VHDL 2008 are the primary cases here), and note that the usage will default to both sim and synth. Future versions of FuseSoC will allow overriding options in the IP-Xact files to handle these cases. The IP-Xact support is described in greater detail in an earlier post.


Mixed-language support


With the new fileset features in place, most of the simulators and synthesis tools now accept Verilog, SystemVerilog and VHDL files simultaneously for mixed-language projects.

 

Compile-time parameter enhancements


Both build backends now support setting top-level Verilog parameters at synthesis time. This can be used for example to initialize a boot ROM with different contents, or to set a version tag at compile time. Using the de0 nano example once again, the following can be added to the .core file to register a parameter:

[parameter bootrom_file]
datatype    = file
description = Boot ROM contents in Verilog hex format
paramtype   = vlogparam
scope       = public


To select a different boot ROM, run fusesoc build de0_nano --bootrom_file=/path/to/bootrom.vh.

Slightly related are some enhancements to the simulator backends. Both Verilator and XSIM now accept parameters which are turned into plusargs, and all simulator backends now analyse the CLI arguments before building the simulator model, to save some time.

Distutils and pypi


Apparently all the cool kids put their Python code on pypi, and it was suggested that I should do the same with FuseSoC. The easy way to do that is to replace autotools with a Python-based build system called distutils...or setuptools... which at some point gets used by pip... to create an egg or a wheel... which can be installed with pip.... or easy_install. I'm really not sure. Python packaging is a complete mess with around ten competing build systems. Try to Google it if you don't believe me. It's insane. I think this tweet pretty much sums it up.

So what are the benefits of this new Python-based build system? A small reduction in the line count of the configuration files and the possibility to upload to pypi. It doesn't handle dependencies unfortunately, and there is no way to uninstall a package. How I love progress! The autotools-based build system is still there, until I figure out what to do.

That's all for now. I already have plenty of things lined up for the FuseSoC 1.5 release.


Thursday, January 14, 2016

FuseSoC and IP-Xact

When I started the work on what would become FuseSoC, I had the ambition to somehow take advantage of the IP-Xact standard. Unfortunately my time to work on FuseSoC is limited, and there have been many other features that got higher priority. In the end, it might actually have been a good thing that it wasn't added until now, since it gave me plenty of time to get a feel for IP-Xact and figure out just how to best integrate it with FuseSoC.

Let's begin with a small introduction to IP-Xact.

According to Accellera, who oversees the standard, IP-Xact is "a well-defined XML Schema for meta-data that documents the characteristics of Intellectual Property (IP) required for the automation of the configuration and integration of IP blocks; and to define an Application Programming Interface (API) to make this meta-data directly accessible to automation tools"

So IP-Xact is one or several XML files that you can add to your IP core to describe certain properties of the core. There are tons of features in IP-Xact to handle different aspects of the design. These include things like register maps, external ports, source files, parameters, build commands and much more. IP-Xact files are becoming quite common as a method of encapsulating IP so that it can be integrated more easily in different EDA flows. Many of the EDA vendors are using it, even though they sometimes tend to use so many vendor-specific extensions that the IP in practice is of limited use outside of their own tools.

The observant reader will notice that some of IP-Xact's features are directly overlapping with the FuseSoC .core files. With the premise that double documentation is a bad thing, let's take a look at different options for only having one source for metadata.

  1. Just use FuseSoC .core files. This is what we have today, but the whole idea of the IP-Xact integration is to expand the FuseSoC universe, take advantage of an existing standard and try not to reinvent things. Also, this article would have been terribly short if I had decided on this option.
  2. Just use IP-Xact files. This is a solid proposition, but my opinion on IP-Xact is that it is a flawed standard. It's currently the best (only?) chance of a vendor-neutral standard at all though, so I can forgive some of its drawbacks, but not all of them. One of the more annoying problems is the file_type parameter that can be attached to each file. The latest IP-Xact standard currently defines 38 language types. Verilog is represented by verilogSource, verilogSource-95 and verilogSource-2001. VHDL has three similar options. C only has cSource, and there are a few more generic ones such as swObject, SDC, unknown and user. The problem here is that this list doesn't stand a chance of keeping up with all the new and existing file types that people want to use for digital design. It leaves out both VHDL 2008 and Verilog 2005, which makes it harder to decide on which compile flags to use for the EDA tools. It's also not aware of the new school of HDLs like Chisel, MyHDL, Migen or Cx, to mention just a few (a better solution was suggested here).

    I'm also not completely convinced by the attempt at becoming a build manager à la CMake or Apache Ant. The build management part of IP-Xact already has quite a lot of options, but is not nearly flexible enough to support complex builds. I would rather see this part left out, to avoid the standard becoming too unwieldy.

    There are also plenty of options in the .core files that are not in IP-Xact. "A-ha!", says the experienced IP-Xact user here. "But IP-Xact has a built-in mechanism for specifying vendor-specific extensions. All FuseSoC nonsense could just be marked as extensions". True, but first of all, that clashes a bit with the idea that FuseSoC shouldn't make unnecessary demands on the IP cores. One thing in particular that would be a bit complicated is how to specify where to find dependencies for a core. All this might be solvable, but I also find IP-Xact a bit too heavy and at the same time not specific enough compared to the existing .core files for some parts.

  3. My preferred option is instead to keep the .core files, but allow specifying an accompanying IP-Xact file from where FuseSoC can get additional metadata. This allows us to work with cores both with and without IP-Xact files and we don't have to make any changes to the upstream cores. We can avoid double documentation and add things to the .core files that are hard to do with IP-Xact. Win, win, win, win!

    To further motivate this decision, I'm drawing on the experience of software package managers here. To make a comparison with the software world, IP-Xact is our AppStream file, which contains metadata for applications and can be used in different package managers. The program itself is still put in a .deb or a .rpm file, or specified by a .ebuild or pkgbuild file (the .core file). These are different layers that have some overlapping information.

For FuseSoC I'm currently using the file sets and core description, since these are options that are already available in FuseSoC and can be easily fetched from the IP-Xact file instead. There are other things in IP-Xact that could be used in the future as well, but we all know that the future is scary, so let's not talk about that.

To make FuseSoC aware of a core's IP-Xact component file, this file has to be added to the main section with the new option 'component=<file.xml>'. When the core is loaded, the filesets in the IP-Xact file are parsed and added to the list of filesets found in the .core file. There are some extra features on the roadmap here to make it possible to merge or replace filesets with similar names in both the .core and the IP-Xact file. Merging filesets can be useful, as some file types (such as verilogSource-2005 and vhdlSource-2008) and other options are only available when the filesets are specified in the .core file. The group tag from the IP-Xact file should probably be merged with the usage tag in the .core file as well at some point.

Note also that the component option doesn't have to be a single file, but can also be a space-separated list of files. In this case, filesets are appended, but the description is taken from the first file that sets it.
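As a sketch (the second file name is invented for illustration), a core pulling filesets from two component files could declare:

```
[main]
component = de0_nano.xml de0_nano_tb.xml
```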


The first IP-Xact features will be available in FuseSoC 1.4, but they are already available in git, and as a bonus, I have also put together a Proof of Concept system (a PoC SoC :)), together with some instructions here, that uses the new features. It's a stripped-down version of the de0 nano system that is available in orpsoc-cores and can both be simulated and built to an FPGA bitstream.


For those of you who haven't used FuseSoC before, you can find installation instructions in the FuseSoC repo.

After installation, add the path to de0_nano_ipxact to your core library path and run fusesoc build de0_nano_ipxact to build an FPGA bitstream, or fusesoc sim de0_nano_ipxact to run a simulation in Icarus Verilog.

Have fun and let me know what's good and what can be improved!

Saturday, December 19, 2015

FuseSoC and VUnit

I recently improved the VHDL support in FuseSoC, and since I've been using VUnit a bit lately I thought it could be a fun experiment to see if I could combine the strengths of these projects.

For those not aware, FuseSoC is a package manager and build system for FPGA/ASIC systems. It handles dependencies between IP cores and has functionality to build FPGA images or run testbenches for a single core or a complete SoC. Each core has a .core file to describe which source files it contains, which dependencies it has on other cores and lots of other things. The aim of FuseSoC is to make it easier to reuse components and create SoCs that can target different FPGA vendors and technologies. FuseSoC is Open Source, written in Python and can be found at https://github.com/olofk/fusesoc

VUnit is an open source unit testing framework for VHDL released under the terms of Mozilla Public License, v. 2.0. It features the functionality needed to realize continuous and automated testing of your VHDL code. VUnit doesn't replace but rather complements traditional testing methodologies by supporting a "test early and often" approach through automation.

VUnit has a lot of great functionality for writing unit tests in VHDL, but requires the users to set up the source tree, with all dependencies and their libraries, themselves.

FuseSoC, on the other hand, has knowledge of each core's files and dependencies, but few convenience functions for writing unit tests. The ones that exist mainly target Verilog users.

Given these preconditions, my idea was to let FuseSoC collect all source code and give it to VUnit to run unit tests on them.

Let's get started

VUnit requires the user to write a small Python script that sets up simulation settings, collects all source files, puts them into libraries and starts an external simulator. This script is then launched from the command line with options to decide which unit tests to run, which simulator to use and where the output should go, among other things. Here's the example script from VUnit's user guide:
 
# file: run.py
from vunit import VUnit

# Create VUnit instance by parsing command line arguments
vu = VUnit.from_argv()

# Create library 'lib'
lib = vu.add_library("lib")

# Add all files ending in .vhd in current working directory
lib.add_source_files("*.vhd")

# Run vunit function
vu.main()

FuseSoC is also launched from the command line and expects to be instructed which core or system to use, whether it should do synthesis+P&R or run a simulator, and other options. FuseSoC, however, was always meant to be used as a library as well as a command-line tool, so to make these work together, we create a VUnit run script that imports the necessary functions from FuseSoC. Thankfully both tools are written in Python, or I would have given up at this point.

The first inconvenient difference between FuseSoC and VUnit is that the VUnit run script needs all source files that should be compiled for any testbench to run. FuseSoC, on the other hand, doesn't know which source files to use until we tell it which core to use as its top-level core. To work around this I decided to look at VUnit's -o option, which is used to tell VUnit which output directory to use. I simply peek at the output directory and use that as the FuseSoC top-level core name. We now have the first lines of the new script.

import os

from vunit import VUnit

vu = VUnit.from_argv()

top_core = os.path.basename(vu._output_path)

Now we need to do some basic FuseSoC initialization. First we create a core manager, which is a singleton instance that is handling the database of cores and their dependencies.

from fusesoc.coremanager import CoreManager, DependencyError
cm = CoreManager()

Next step is to register a core library in the core manager. Normally FuseSoC picks up locations of core libraries from the fusesoc.conf file, which can be in the current directory or ~/.config/fusesoc, or from the --cores-root=/path/to/library command-line option.

We don't have any command-line options for this, but we can get fusesoc.conf by using the Config() singleton class.

from fusesoc.config import Config
config = Config()
cm.add_cores_root(config.cores_root)

We can also add any known directories directly with

cm.add_cores_root("/path/to/corelibrary")

The core manager will scan the added directories recursively and pick up any FuseSoC .core files it finds. (Note: If a .core file is found in a directory, its subdirectories will not be searched for other .core files).

It's now time to sort out the dependency chain of the top-level core we requested earlier:

try:
    cores = cm.get_depends(top_core)
except DependencyError as e:
    print("'{}' or any of its dependencies requires '{}', but this core was not found".format(top_core, e.value))
    exit(1)

If a dependency is missing, we tell the user and exit. If all was well, we now have a sorted list of cores in the 'cores' variable. Each element is a FuseSoC Core class that contains all necessary information about the core.

With all the cores found, we can now start iterating over them in order to get all the source files and other information we need and hand it over to VUnit. Some notes to the code below:

1. 'usage' is a list of tags to look for in the filesets. Only look at filesets where any of these tags are present. FuseSoC itself looks for the names 'sim' and 'synth' to indicate if the files should be used for simulation and synthesis. We can also choose to only use a fileset with a certain tool by for example setting the tag 'modelsim' or 'icarus' instead of 'sim'
2. File sets in FuseSoC can be public or private. Default is public, which indicates that other cores might find the files in there useful. This is for example the files for synthesis and testbench helper functions such as BFMs or bus monitors. Private filesets are used for things like core-specific testbenches and target-specific top-level files.
3. Even though the most common way to use libraries is to have one library for each core, there's nothing stopping us from splitting a library into several FuseSoC cores, or let one core contain multiple libraries.

usage = ['sim']
incdirs = set()
for core_name in cores:
    core = cm.get_core(core_name)
    core.setup()
    basepath = core.files_root
    for fs in core.file_sets:
        if (set(fs.usage) & set(usage)) and ((core_name == top_core) or not fs.private):
            for file in fs.file:
                if file.is_include_file:
                    #TODO: incdirs not used right now
                    incdirs.add(os.path.join(basepath, os.path.dirname(file.name)))
                else:
                    try:
                        vu.library(file.logical_name)
                    except KeyError:
                        vu.add_library(file.logical_name)
                    vu.add_source_file(os.path.join(basepath, file.name), file.logical_name)


With that we are done, and can safely leave it to VUnit to do the rest

vu.main()

It all works fine, and to make it more interactive, I set up a demo project at https://github.com/olofk/fusesoc_vunit_demo. This contains a packet generator for a made-up frame format together with a testbench. The packet generator has dependencies on two utility libraries (libstorage and libaxis) that I wrote a while ago to handle some common tasks that I got fed up with rewriting every time I started a new project. The packet generator testbench uses some VUnit functions to make the example a bit more educational.


I hope this will be useful for all people using, or thinking of using, VUnit. For all you others, you can still use FuseSoC without VUnit for simulating and building systems and cores. The FuseSoC standard core library that can be downloaded as part of the FuseSoC installation contains about 60 cores, and there are several more core libraries on the internet that can be combined with this.

Happy hacking!

Sunday, May 11, 2014

OpenRISC health report April 2014

About once a year, I'm at some conference to present the OpenRISC project to a new audience. One part of the presentation is often a quick rundown of all the things that have happened since the year before. The problem is that there's just too much stuff going on as the project continues to grow. To remedy this, I'm planning to write a small summary every month, or every quarter (time will tell), both to remind myself of what we are doing, and to make it more visible to the casual observer that there is pretty cool stuff going on with this nifty little CPU and the things around it. As this is the first installment, I would really have to present everything that has happened over the last 15 years to give a complete picture. I will not do that. You will have to settle for a backlog from last year's OpenRISC conference.

While writing this article, I had to constantly ask for help explaining some of the concepts. This once again showed me how large this project has become and how much in-depth expertise is shared by the participants. It also shows how fast things are moving, since most of the items below are things that have happened only during the last six months! I probably need to write a bit more often to avoid having to write a novel each time, but enough introduction. I will now leave the stage to OpenRISC and its friends...

Debian for OpenRISC 


Perhaps the news item that has had the most coverage is the work done mainly by Christian Svensson to create a Debian port for OpenRISC. While OpenRISC, or rather the or1k architecture to be more precise, has been a supported target in the official Linux kernel since version 3.1, users have had very little support for creating a complete OpenRISC-based system. With the Debian port, things should be much simpler, and once the basic system is installed, your favorite package is just an apt-get install away. Naturally all packages aren't supported yet, and the package creation process has uncovered several bugs and missing features in the OpenRISC toolchain. The upside of this is that most of these bugs were fixed extremely fast, mostly by Stefan Kristiansson and Christian Svensson. Roughly 5000 packages have been built so far, thanks to Manuel A. Fernandez Montecelo. Many of the packages have been tested both in qemu and on a Digilent Atlys board with mor1kx. Apart from a multitude of library dependencies, the package list includes X.org, qt, scummvm, irssi and Fluxbox, which means that you can now play games and chat somewhat comfortably on an OpenRISC system.

More information can be found here


Atomic operations


When running multiple cores, or multiple threads on the same CPU, each thread or core needs to know if someone else is currently trying to access an exclusive resource, such as a UART or a part of a shared memory. Normally this is done in software with mutex operations, such as semaphores or spinlocks, and the requirement for implementing a mutex is that a mutex operation is never allowed to be interrupted. Previously on OpenRISC this was done by making a syscall that disabled all interrupts as its first instruction. That way it could make sure it wasn't interrupted by something else while it was busy doing its work. Unfortunately this method is quite slow, and just running ls generates about 1700 system calls, of which roughly 90% are mutex operations. Also, the syscall method won't work at all for multiple cores.

To improve the situation, Stefan Kristiansson has added two completely new instructions to the OpenRISC instruction set, called l.lwa (Load Word Atomic) and l.swa (Store Word Atomic). These two instructions form a load-link/store-conditional atomic operation, which can be read about in further detail on Wikipedia. Apart from the updates to the architecture specification, Stefan has implemented the operations in the cgen CPU description in binutils so that the toolchain knows about them and can use them, in the instruction set simulator or1ksim, and in the RTL implementation mor1kx, together with test cases.
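As a rough sketch (the register choices and comments are mine, not taken from the architecture manual), an atomic increment built from this load-link/store-conditional pair follows a retry loop: l.swa only performs the store if nothing else has written to the address since the matching l.lwa, and sets the flag on success:

```
retry:
    l.lwa  r5, 0(r3)     /* load word from [r3] and set a reservation   */
    l.addi r5, r5, 1     /* modify the value locally                    */
    l.swa  0(r3), r5     /* store only if the reservation still holds;  */
                         /* sets the flag on success                    */
    l.bnf  retry         /* flag not set -> someone interfered, retry   */
     l.nop               /* branch delay slot                           */
```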

An interesting side note is that while no OpenRISC implementation has ever had atomic operations before, Stefan found mentions of a pair of operations in a very early version of the architecture specification, and decided to reuse these names.

Multicore


OpenRISC has traditionally been used as a single core CPU in a SoC, but there is an increasing demand for spreading workloads over several cores in a system. Of course, it has always been possible to instantiate as many OpenRISC cores as you can fit in an FPGA or ASIC. The problems arise when they need to cooperate to share resources like main memory or peripherals (exclusive accesses), spread a workload evenly between them (load balancing) and make sure that their local caches are kept in sync with each other (cache coherency).

For several years, Stefan Wallentowitz has been working on the necessary hardware and software additions required for multicore OpenRISC as part of his research project OpTiMSoC. A few weeks ago he made an announcement that there now was a multicore OpenRISC demo SoC based on mor1kx running under FuseSoC. The required changes have been added to mor1kx, and apart from a special version of the newlib toolchain it uses unmodified versions of the software and hardware IP cores.

FuseSoC


During the last two or three years I have been working on a successor to the OpenRISC Reference Platform System on Chip (ORPSoC) that is used to run RTL simulations and provide a base for porting an OpenRISC SoC to a new FPGA board. The project was called ORPSoCv3 (to replace the old ORPSoCv2) and I did three releases before swiftly renaming it to FuseSoC. Why? you ask. Well, it turned out that there wasn't a single line of OpenRISC-specific code in FuseSoC, and I want it to be used as a general-purpose tool for building and simulating cores and SoCs for FPGA. This has become reality as it is now used for a SoC based on the eco32 CPU in addition to being the OpenRISC standard tool and other Open Source projects have shown interest as well.

Apart from the renaming and the never-ending bug fixing, FuseSoC has gained many new features over the last six months. Some of the highlights include support for using SystemC in Verilator testbenches, VHDL support, and using ISE to create images for Xilinx FPGAs. More details can be found in the FuseSoC repository on github. The number of contributors continues to increase for every release, which is a good indication that it is used by many people... or at least that many people find bugs.

ORPSoC Cores


The other part of the FuseSoC infrastructure is the collection of cores and systems that can be utilized by FuseSoC. The main repository is called orpsoc-cores and contains ready-made OpenRISC SoCs for different FPGA boards as well as the cores that are used on the boards.

Since October last year, support has been added for the very popular Digilent Atlys board, with ports for the LX9 Microboard, DE2 70, SoCKIT, NEEK and ordb2a in the works.

Many new cores have been added as well, and there are currently 39 supported cores in total. For more details, see the ORPSoC Cores repository on github.

Binutils


The de-facto standard open source toolchain for creating programs consists of two parts: the compiler (GCC, with LLVM getting more common as well) and binutils. Binutils can be summarized as everything but the compiler. This means the linker, assembler, archiver and tools for modifying ELF files. OpenRISC support in binutils was added 14 years ago, but since then, no improvements have been sent upstream.

There has been a huge amount of work over the last years to improve the OpenRISC binutils support, and earlier this year it was decided that these changes should be sent upstream. Having them in the official binutils distribution means that anyone who wants to make binutils for OpenRISC can now use the official binutils package.

In the OpenRISC case, sending the changes upstream wasn't completely trivial even though the code itself was in very good shape, as the FSF (Free Software Foundation), which owns binutils, needs written agreement from all contributors that they are OK with assigning the copyright of their code to the FSF. That meant the first thing that had to be done was to go through all the changes and hunt down everyone who had contributed.

I volunteered to take on this mind-numbing job, which wasn't made any easier by the fact that we switched VCS from SVN on OpenCores to Git on GitHub a few years ago, and renamed the architecture from or32 to or1k. Luckily, most people had been very good about updating changelogs, which made things a lot easier. In the meantime, Christian Svensson had split up the OpenRISC-specific changes into a set of patches that were sent in for review. When that was done, all contributors had to get a copyright assignment document mailed from the FSF to sign and send back. As the patches had already been reviewed, the rest of the process went quickly once all the mails had been sent back.

Stefan Kristiansson and Christian Svensson are now the proud caretakers of the OpenRISC port, which should be included in the next binutils release.

musl


While there are many things that most people agree on, there seems to be no such consensus when it comes to standard C libraries. Everyone seems to have their own specific use case that requires a brand new C library. OpenRISC already has support for glibc, uClibc and newlib, but the new kid on the block when it comes to embedded software is musl. musl is designed to be small and portable, which suits the OpenRISC use case just fine. Stefan Kristiansson has prepared a musl port which is mostly done, apart from some details about syscall handling that need to be resolved first. Now we only need to port Bionic and we're ready to take on Android.

mor1kx


The up-and-coming mor1kx OpenRISC RTL implementation is rapidly gaining new features, perhaps as a testament to its more modular design compared to the original or1200 implementation. In addition to the atomic operations mentioned above and multiple bug fixes, it now supports caches with more than two ways and the ability to read the GPRs through the debug port.

ORconf 2014


Last year's OpenRISC conference in Cambridge, England was a great success and attracted participants from both academia and industry. As for this year, Stefan Wallentowitz has kindly offered to host the conference at Munich University. Initial planning has started and nothing is set in stone yet, but it will probably take place the weekend of October 11-12. More information will become available at http://orconf.org over the coming months.

onsdag 9 oktober 2013

orconf2013; Reports from the yearly OpenRISC conference

Dear diary,

I just got back from the yearly OpenRISC conference, orconf, which this year was held at the university in Cambridge, UK.

TL;DR:
The conference was a great success! The flight there was not.

After forcing myself out of bed early on Saturday morning, I spent half the day in a barn on the outskirts of Göteborg, since a broken radio transmitter delayed the flight by over three hours. That was enough to make me miss the introduction talk as well as the presentations from David Greaves on a SystemC model of a multicore OpenRISC with power annotation, Stefan Wallentowitz on OpTiMSoC and Jonathan Woodruff on BERI: A Bluespec Extensible RISC Implementation. Thankfully, all talks were recorded and will be available online once they have been transcoded.

The delayed flight also made me miss the first minutes of my own presentation on ORPSoCv3, which served as an introduction to the workshop later in the afternoon. Fortunately, I had plenty of time allocated for my short introduction to the subject. The presentation was actually a bit shorter than what I had originally planned, since there have been way too many other things to do lately, and I hope to have time to make a more complete introduction next time.

After the presentations, the conference went on with a workshop based around getting OpenRISC running Linux on a DE0 Nano board. Embecosm kindly provided enough hardware to let all participants team up in groups and have their own board to play with. Slightly different incarnations of this workshop have been presented by Embecosm and Julius Baxter at other events such as OSHUG, but for me it was a milestone, since we decided quite recently to base the hardware workflow on ORPSoCv3, which is what I have been working on for the last few years.

It all went surprisingly well. Except for a participant who had problems with his Python version, I didn't hear many complaints. Most of the credit has to go to Julius Baxter for preparing and writing down precise instructions on how to get started, Stefan Kristiansson, who prepared a precompiled OpenRISC toolchain and did an ORPSoCv3 port for the DE0 Nano (in 30 minutes!), and Franck Jullien, whose work on OpenOCD has made debugging easier, and who also provided the first ORPSoCv3 board port, proving that it actually worked. But it made me very happy as well, since it shows that the time I have spent on ORPSoCv3 has paid off. It's now a little easier to build and simulate an OpenRISC-based SoC.

Since I don't have a DE0 Nano board myself, I spent the time trying to make a port for my trusty ordb2a board based on the DE0 Nano port instead. It was actually my first board port of my own, and the process went well, but when the time came to program the board and connect to it, it turned out my development environment was not up to the task, and I had to spend most of the time compiling debug software and hunting down patches. The outcome is that I now have a patch for OpenOCD that makes it work properly with the ordb2a boards, which I should send upstream soon.

The first day of the conference ended after the workshop, but as usual with these events, the fun doesn't stop when we leave the conference building. We all went for a fantastic dinner at St John's Chophouse, where we continued to talk about everything from switching characteristics of crypto algorithms to high-level HDL languages. A few of us went on to try out some of the fine establishments in Cambridge afterwards and stopped by a pub where you could pay for your drinks with bitcoins. After the initial amazement of being able to pay for things in real life with bitcoins, this started another discussion on critical paths and process nodes for Bitcoin mining ASICs. It's funny how all conversations seem to end up in that direction.

After some well-deserved sleep, the Sunday started off with Stefan Kristiansson and Julius Baxter presenting the latest improvements to mor1kx. Stefan continued the great tradition from last year of being the one who provides us with eye candy. This year he showed us Day of the Tentacle running under ScummVM via libSDL, and glxgears running under X. These are two things that wouldn't have been possible if not for the great work on the toolchain during the last year by Sebastian Macke, Stefan Kristiansson and others whom I should mention but can't remember right now. mor1kx itself has grown into quite a mature CPU by now, with three different pipeline implementations covering the range from running a full-blown Linux system down to deeply embedded bare-metal applications. It has also found its use in OpTiMSoC and will probably be the default or1k implementation in ORPSoCv3 in the near future.

Next in line was Martin Schulze, who grew frustrated with the tedious work of setting up an ORPSoCv2 port for a new board and wrote a configuration system that's hooked up to Eclipse. While it might sound like it conflicts with the work on ORPSoCv3, it will actually be a great combination instead, and I look forward to integrating the two efforts to make new board ports even more painless.

Last year we had a presentation where the first OpenRISC in space was unveiled, and the space theme continued in Guillaume Rembert's presentation on OpenRISC for space applications and the EurySpace SoC. Guillaume clearly showed the advantages of using OpenRISC in an environment where failure is not an option, as it's flexible, battle-proven and not tied to a single vendor.

After a well-deserved coffee break, Franck Jullien took the stage and showed the new and improved GDB. Together with his work on OpenOCD, which was accepted upstream just about a week ago, the debugging support for OpenRISC is in better shape than ever.

Last of the major presentations was Davide Rossi, who talked about his group's research on extremely low power ASIC SoCs, where they had used multiple or1200 cores in a 28nm ASIC that will likely tape out later this year. During their work with or1200 they made several improvements to the critical path that might find their way into the upstream or1200 repo. As the room was full of OpenRISC experts, we shared ideas on how to verify the ASIC once it was completed.

The day went on with a discussion on the feasibility of an OpenRISC 1000 successor. The idea started out a few years ago as a way to rectify some of the deficiencies of the or1k ISA, and some of the thoughts that have come up over the years are summarized on the OpenCores or2k page. The general opinion this year, however, was that with the current number of contributors to the OpenRISC project, it would be too much work to start a new implementation. If this were taken up as a student project or as a research project we would embrace it with open arms, so all academics reading this, please come forward.

One of the interesting things when working on a smaller architecture is that you sometimes realize how much the design choices of the large players are taken for granted. Sebastian Macke discovered that the way our ABI is defined differs in some regards from how ARM and Intel do things. This is usually not a problem, but combined with how some programs break the C standard for varargs in a subtle way, we have problems that only manifest themselves on some architectures. Tricky! The discussion was about whether we should change our ABI to avoid this problem, but we decided against it for now to see how widespread the problem is.
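The post doesn't say exactly which varargs violation Sebastian ran into, but a classic example with the same flavor is a caller passing the integer constant 0 as a list terminator where the callee reads it back with va_arg as a pointer. On ABIs where int and pointer arguments happen to be passed identically this "works"; on others the callee reads garbage. A sketch (function name and contract are mine, purely illustrative):

```c
#include <stdarg.h>
#include <stddef.h>

/* Counts the strings in a NULL-terminated argument list.
 * Correct callers must terminate the list with (const char *)NULL.
 * A caller that passes a bare 0 instead invokes undefined behavior:
 * the 0 is passed as an int, but va_arg reads a pointer, and whether
 * that happens to work depends entirely on the ABI. */
static int count_strings(const char *first, ...)
{
    va_list ap;
    int n = 0;
    const char *s = first;

    va_start(ap, first);
    while (s != NULL) {
        n++;
        s = va_arg(ap, const char *); /* expects a pointer-typed argument */
    }
    va_end(ap);
    return n;
}
```

A correct call looks like `count_strings("a", "b", (const char *)NULL)`; the broken variant `count_strings("a", "b", 0)` is exactly the kind of code that only misbehaves on some architectures.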

Following the ABI discussion were two short lightning talks. One by Julius Baxter to show off jor1k, the JavaScript OpenRISC emulator by Sebastian Macke, as it had been referred to multiple times during the conference, and also to mention that most of us can be found on the #openrisc channel on irc.freenode.net. The other talk was by Jeremy Bennett on the latest improvements to Verilator. Apparently Verilator now supports most of the synthesizable subset of SystemVerilog, which is getting more and more important as the rest of the world is moving in that direction. This means that we can continue to use this fantastic tool in modern development environments, and a few high-profile projects where Verilator is used were also mentioned.

To finish it all off, we had the yearly bug squashing session. Out of 71 open bugs, we were able to mark nine as invalid or already fixed. Eight other bugs were reprioritized or reassigned to the right person, and I managed to fix one RTL bug in or1200 on the flight back home. In total, that means we could close around 12% of our open bugs with little effort. Responding in a timely manner to bugs that have been reported is extremely important. We have previously seen potential contributors lose interest after seeing their reports go unnoticed, so in addition to the yearly session, I hope that everyone goes through Bugzilla from time to time to find things that can be fixed.

The flight back home was delayed once again, but we managed to catch up enough so that Ryanair could play their we're-on-time fanfare, and once home I slept like a baby. Unfortunately, my baby didn't.

In retrospect, a few interesting observations could be made from the topics of the talks. It seems that multicore OpenRISC was a large part of the presentations this year, for different reasons. Stefan Wallentowitz's focus was on many-core implementations, while Guillaume Rembert needed it for redundancy in fail-safe applications and Davide Rossi needed it to try out different levels of power optimization. Hopefully, the combined work might move things forward, and having a conference like this is a great way to make people aware of each other's work. The other thing, which has also been my pet peeve and the reason for starting work on ORPSoCv3, is the need for easier system generation and interconnections. This is being addressed now, and there seems to be a lot of interest in helping out in this area.

I enjoyed the little I had time to see of Cambridge, and it's a fantastic feeling to be in the company of so many talented people, both from academia and industry. We were approximately twice as many as last year, which also goes to show that the project has healthy growth. This can also be seen in the influx of contributors, many of whom unfortunately weren't able to join us. Hopefully, we will meet next year, or at some other occasion. I hope that everyone watches the recorded presentations when they become available on the conference page, as this short summary doesn't do justice to all the great talks.

Finally, special thanks should go to Julius Baxter for once again organizing the conference and making sure it got off the ground, to the whole of Embecosm for sponsoring the event, and specifically to Jeremy Bennett for rounding up a few of the speakers that we otherwise would have missed out on. Thank you also to David Greaves and the University of Cambridge for hosting us and taking care of thirty meek geeks during the weekend.


Hope to see you all again!

tisdag 4 juni 2013

initial begin



Software is built up of layers and layers of abstractions, which has the pleasant effect of hiding all the underlying madness that computer software is built upon. If we were still fearless enough to take the long and winding road down from a user-facing application, through the libraries, operating system and drivers, we would eventually end up at the register map. This is an impenetrable wall, even for most of the hardcore close-to-the-metal low-level-driver gurus. This is the boundary between the chips and the instructions that are meant to make the chips do something useful. It is, however, not the final frontier. Some fearless souls, calling themselves FPGA/ASIC designers, Digital Design Engineers or Hardware Developers, have decided to take the spirit of Open Source and Free Software beyond the wall and into the realms of the silicon. In the Open Source world, hardware is generally considered Open Source friendly when there exists a documented register map and perhaps some use cases, so why do more? Well, for us on the hardware side, we want a little more than that. Hardware can be just as buggy as software, and working with closed-source IP cores is generally a pain. This ranges from the obvious problem that you can't debug your code properly, to insane license agreements and, in many cases, strange restrictions that the license holder can impose on the development environment. While these are all very practical reasons for doing all this, I think that the driving force for most of us is that it is pretty damn fun!

I myself come from a background of having worked professionally with FPGAs and ASICs as a consultant for five years now, and I have been involved with software and hardware since my first stumbling peeks and pokes in QuickBasic in the mid-nineties. Over the years, I have enjoyed the great success of Open Source Software and contributed back to a bunch of different projects. In 2010, I got involved in the OpenRISC project through my employer at the time. During these last three years working on the OpenRISC project in my spare time, I have found myself in the company of extremely talented people who are not afraid of learning new things or taking on overwhelming challenges. Combined with a large ecosystem of IP cores, toolchains, operating systems, test suites, documentation, development tools and basically everything else you expect from a processor architecture, this has come to be a huge and very interesting project.

The main reason for starting this blog is to shed some light on all the things we are working on in the digital domain of the Open Source ecosystem, and in particular the OpenRISC project. Searching around the internet, it turns out that the amount of written information is sparse. Apart from our official OpenRISC page, which holds our ever-growing wiki and links to most things related, Sven Andersson has written a series of articles about his experience with OpenRISC as a newcomer to the scene. This has provided us with great feedback on how we can improve and streamline things, as well as getting many people interested in trying out this mythical CPU creature. Franck Jullien has also written some articles on his blog about his work with some cool debugging features for OpenRISC. Other resources that have served us well over the years are a series of application notes from Embecosm on SystemC modelling, GDB protocol implementation and C libraries, written by long-time contributor Jeremy Bennett. Many of those articles have found use outside of the OpenRISC world as well. There are probably many more articles that I have forgotten about, and it would be great if everyone who feels left out could come forward so that we can collect a definitive list of all related articles. In addition to written articles, many of us like to show up at conferences to talk about what we do in the project, hang out and hack on things. Julius Baxter and I have held presentations (here and here) and a workshop at FSCONS, and visited FOSDEM last year. 2012 was also the year we held the first (hopefully annual) OpenRISC Conference, which was a great success. Sitting through the presentations, it was clear that the project is growing and that an impressive amount of work is being done by extremely talented hackers.

While all of these articles and presentations have given us a chance to show what we are doing, there is still a lot of really cool development that goes unnoticed by the uninitiated. I hope to get some time to write about all the fun stuff that is going on, and most of all, write about all the fun stuff that I am doing.


Most of my own work isn't directly related to the OpenRISC CPU itself, but to the infrastructure that surrounds it. I'm maintaining an Ethernet MAC, have started an extremely low-volume mailing list on the Wishbone bus (which is the bus that OpenRISC, as well as a few other Open Source CPUs, uses to connect to other IP cores), done some C library hacking, fixed bugs, reminded other people about forgotten patches, and been around to companies, conferences and schools to talk about using Open Source IP cores in their workflow. For about two years, my main focus has been on creating a platform for simulating individual cores as well as simulating and building systems based on the many fine IP cores that are available at OpenCores and other places. The project is called orpsocv3, or the OpenRISC Reference Platform System on Chip version 3, and is planned to replace our current platform, orpsocv2. orpsocv3 is packed with functionality that will hopefully make development easier compared to the existing system, and will hopefully be prominently featured in coming articles. The code is still at an early development stage, but you can check it out here if you want to see for yourself what all the cool kids are talking about. To avoid the infamous scope creep, I'm resisting the urge to write more about orpsocv3 now and will instead conclude that proper introductions have been made and we can go on to the technical stuff in the next post.
end

måndag 27 maj 2013

Scope Creep


During my work on ORPSoCv3, I realized that there were some problems with the SPI Flash memory model that I planned to reuse from ORPSoCv2. To use the file (and more importantly to make it publicly available) it turned out I needed a written license agreement. I now had three choices:

1. Pretend I didn't see the license - Not gonna happen. I'm building a product intended for both hobbyist and commercial use, so I want these things to be done correctly, or I risk having them come back to haunt me.

2. Try to contact the right people to ask for a relicensed version of the file. It could be worth trying, as there is at least a company name and the author's name in the file header that I could use to track down the owner. On the other hand, the last update was in 2007, and even if I found someone who could claim ownership, it's not at all certain that they would allow relicensing the file.

3. Rewrite the code. Even though I have tried hard not to reinvent too many wheels in ORPSoCv3, this is sometimes the easiest way forward.


The third option, which is what I chose, has some added benefits too. The SPI flash model in ORPSoCv2 is targeted at a specific flash that we used on an old board that I don't intend to support in v3, so the new code could probably be made a bit more generic. Instead of a monolithic BFM for one chip, I could create a generic SPI BFM (which I'm sure would come in handy in other cases) and try out an idea I've had about a memory that uses VPI callbacks to store data in dynamically created C structures instead.

This is where the scope creep started...

First of all, the idea of having a dynamically created memory instead of a huge two-dimensional array comes from the assumption that in some cases, we need a large memory map but only read and write small parts of that memory. Having a huge static Verilog array means we have to allocate a lot of memory that will never be used, and the simulator might need to check for events on every bit, which could slow things down considerably. (Note that I'm using might and could here, since I haven't done any benchmarks to prove anything yet. It could also be very simulator dependent.) So the idea is to make an array of pointers to memory segments and only malloc a page when it is written to for the first time. The Verilog code would have $read and $write functions that access the memory, and some init functions to preload images (elf, exo, bin, vmem, srec and whatever else could be useful).
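As a rough sketch of the idea (names, page size and layout are mine for illustration, not the actual ORPSoCv3 VPI code), the C side behind the $read and $write system tasks could look something like this:

```c
#include <stdint.h>
#include <stdlib.h>

/* Sparse memory: a table of page pointers covering a 32-bit address
 * space. A page is only allocated the first time it is written, so
 * a mostly-empty memory map costs little more than the pointer table. */

#define PAGE_BITS 12                        /* 4 KiB pages */
#define PAGE_SIZE (1u << PAGE_BITS)
#define NUM_PAGES (1u << (32 - PAGE_BITS))  /* 1M page pointers */

static uint8_t *pages[NUM_PAGES];           /* all NULL until first write */

static uint8_t *get_page(uint32_t addr, int alloc)
{
    uint32_t idx = addr >> PAGE_BITS;
    if (pages[idx] == NULL && alloc)
        pages[idx] = calloc(PAGE_SIZE, 1);  /* zero-filled on demand */
    return pages[idx];
}

/* These would be hooked up to the $write/$read system tasks via VPI */
void mem_write(uint32_t addr, uint8_t data)
{
    get_page(addr, 1)[addr & (PAGE_SIZE - 1)] = data;
}

uint8_t mem_read(uint32_t addr)
{
    uint8_t *p = get_page(addr, 0);         /* never allocates on read */
    return p ? p[addr & (PAGE_SIZE - 1)] : 0;
}
```

An image loader would then just call mem_write for each byte of the elf/bin/vmem file, touching only the pages that actually hold data.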

I started hacking on the C code, and it seemed reasonably simple to do this. To try it out, I wanted some real-world testing, but unfortunately I didn't have any nice SPI test cases (nor an SPI BFM). The VPI memory backend, however, would be a good match for a simple model of a Wishbone SDRAM controller. If I could replace the existing controller (Scope Creep), I would be able to run all the existing ORPSoCv2 test cases to get some verification. Looking at the code for the existing model, it turned out to be a little awkward to fit the $read and $write commands into the existing Wishbone memory model, so I decided to find a proper Wishbone BFM that I could use to interface with the memory backend instead (Scope Creep).

Searching around the internet, I found a few commercial ones, a nice-looking one in SystemVerilog, a VHDL version and an implementation in the ethmac testbench (an IP that I also happen to be co-maintainer of).

None of the existing models were up to the task, however, as I need it to be pure Verilog and support Wishbone B3 burst transactions. The first requirement is because neither Icarus Verilog nor the free Altera-provided ModelSim supports mixed-language simulation, and everything I have is in Verilog at the moment. Only the BFM from ethmac fulfilled the first requirement, but unfortunately it didn't seem to support burst transactions. The only thing left to do was to write a brand new Verilog-based Wishbone slave BFM myself (Scope Creep). I figured it would come in handy anyway, so it wasn't that much of a problem.

Implementing a simple BFM wasn't too much work, but it turns out that having a three-week-old baby and full-time employment makes it a bit harder to find spare time to work on side projects like this. It also took a while longer since I at some point decided to refactor some things in ORPSoCv3 (Scope Creep), make a Wishbone interconnect generator to make it easier to hook up the memory model to a CPU (Scope Creep), and to test the interconnect I would need to build a Wishbone master BFM too (Scope Creep). The interconnect part was eventually dropped, but I did implement a basic Wishbone master BFM. I also decided to start with a static memory before trying out the VPI-backed memory model.

Having come this far, I thought it would be fun to tell the world about my new BFM once it was done, show off the capabilities of ORPSoCv3 and provide some cool benchmarks, and since I'm about ten years behind the rest of the world, I decided it was now time to start my first blog (Major Scope Creep!). Starting a new blog leads to some big decisions. I would either need to use an available web service to host my blog or do it myself. I decided to host it on my NAS, since there is already a WordPress package available for it. After installing the package and fiddling with MySQL permissions (Scope Creep), I realized that I needed a name for the blog, so I took a few days to come up with a number of witty names (Scope Creep), abandoning them one after another as it turns out that all witty names for FPGA/hardware/Open Source/electronics blogs are already taken (stupid internet!).

About this time I also realized that I didn't really want to host my own blog, since it probably meant that I would have to tighten up the security of my home network and think a bit about how to provide decent availability, so the focus shifted to finding a good platform to use instead. This left me again with several choices, and the only thing to do was to read up on all the pros and cons of the available services (Scope Creep). I started registering an account on WordPress, but had to stop when I realized that I still didn't have a name for the blog.

By now I was starting to get really tired of my own inability to finalize things, so once I came up with a name for the blog, I registered a Blogspot account instead and started writing. The basic functionality of the BFM is mostly finished now, but to avoid further delays in putting this first post out, I'll finish it and push it to the ORPSoCv3 repo after I'm done writing. The slave seems to have identical timing to the model I'm replacing, so I'll consider that good enough for now.

I'm fully aware that this first post needs a lot of background information to make sense for people outside of the innermost circle of the OpenRISC project, and that I should have written introductions about myself, the OpenRISC project, ORPSoC, Wishbone, BFMs, Open Source Hardware and whatever else I've been talking about. Well, to relate to the title, I had to stop this scope creep before I totally lost control. So to finish things up and provide pointers to what I might write about in coming articles, here's a summary of the work that led up to this article.

Initial target: Add an SPI memory model to ORPSoCv3
Work started: VPI backend for a dynamic memory model, Test bench for Wishbone interconnect, BFM-based static memory model
Work done: Basic Wishbone Master/Slave BFM, Choose a blog platform, Write a first post
Work planned: SPI master/slave BFM, Wishbone Interconnect Generator, Finalize, document and release Wishbone BFM, continue working on rough edges in ORPSoCv3
Work dropped: Hosting a blog, secure the home network, integrate existing Wishbone BFM

As you can see, I'm nowhere near starting on the SPI flash memory model, but at least I have a blog and a partially working Wishbone BFM! :)