Creating a Base System for the Zynq in Vivado

July 31, 2014, 8:40 am


Tutorial Overview

In the ISE/EDK tools, we’d use the Base System Builder to generate a base project for a particular hardware platform. With Vivado, the process is a little different, but we have more control over how things are set up and we still benefit from some powerful automation features. In this tutorial we’ll create a base design for the Zynq in Vivado, using the MicroZed board as the hardware platform.

Requirements

Before following this tutorial, you’ll need the following:

Vivado 2014.2
MicroZed
Platform Cable USB II (or equivalent JTAG programmer)

Create a new Vivado project

Follow these steps to create a new project in Vivado:

Open Vivado. From the welcome screen, click “Create New Project”.
Specify a folder for the project. I’ve created a folder named “microzed_custom_ip”. Click “Next”.
For the Project Type window, choose “RTL Project”. Click “Next”.
For the Add Sources window, click “Next”. The base design needs no HDL sources of its own (the multiplier source code is added in a later tutorial).
For the Add Existing IP window, click “Next”.
For the Add Constraints window, click “Next”.
For the Default Part window, select the “MicroZed Board” and click “Next”.
Click “Finish” to complete the new project wizard.

Set up the Zynq PS

The new Vivado project starts off blank, so to create a functional base design, we need to at least add the Zynq PS (processor system) and make the minimal required connections. Follow these steps to add the PS to the project:

From the Vivado Flow Navigator, click “Create Block Design”.
Specify a name for the block design. Let’s go with the default “design_1” and leave it local to the project. Click “OK”.
In the Block Design Diagram, you will see a message that says “This design is empty. To get started, Add IP from the catalog.”. Follow this advice by clicking on the blue “Add IP” link, or by using the “Add IP” icon.
The IP catalog should appear. Go to the end of the list and double click on the block named “ZYNQ7 Processing System” – it should be the second last on the list. Vivado will now add the PS to the block diagram.
In the Block Design Diagram, you will see a message that says “Designer Assistance available. Run Block Automation”. Click on the “Run Block Automation” link and select “processing_system7_0” from the drop-down menu. Block Automation makes connections and pin assignments to external hardware such as the DDR and fixed IO. It does this using the board definition of the hardware platform you specified when you created the project (MicroZed). We could make these connections ourselves if we were using a custom board, but for off-the-shelf boards, Block Automation makes the process a lot easier.
In the Block Automation window, make sure “Apply Board Preset” is ticked and click “OK”.
Now our block diagram has changed and we can see that the DDR and FIXED_IO are connected externally. Now the only remaining connection to make is the clock that we will use for the AXI buses. We must configure the Zynq to generate a clock and enable a general purpose AXI bus. To make these settings, double click on the Zynq PS block.
The Re-customize IP window will open. From the Page Navigator, select “Clock Configuration” and open the “PL Fabric Clocks” tree.
Make sure that the FCLK_CLK0 is enabled (ticked) and that it is set for a frequency of 100MHz. This will be our AXI clock.
Now from the Page Navigator, select “PS-PL Configuration” and open the “GP Master AXI Interface” tree.
Tick the “M AXI GP0 interface” checkbox to enable it.
Now click “OK” to close the Re-customize IP window.
You should now see a new input port on the left side of the Zynq PS block. This is the AXI clock input. We must now connect the FCLK_CLK0 output to the AXI clock input. To do this, click on the FCLK_CLK0 output and then click on the M_AXI_GP0_ACLK input. This will trace a wire between the pins and make the connection.

Create the HDL wrapper

Now the Zynq is set up, and all we need to do to create a functional project is to create an HDL wrapper for the design.

Open the “Sources” tab from the Block Design window.
Right-click on “design_1” and select “Create HDL wrapper” from the drop-down menu.
From the “Create HDL wrapper” window, select “Let Vivado manage wrapper and auto-update”. Click “OK”.

From this point, we have a base design containing the Zynq PS from which we could generate a bitstream and test on the MicroZed. We haven’t exploited any of the FPGA fabric, but the Zynq PS is already connected to the Gigabit Ethernet PHY, the USB PHY, the SD card, the UART port and the GPIO, all thanks to the Block Automation feature. So there is already quite a lot we could do with the design at this point, such as running Linux on the PS or running a bare metal application on it.

Generate the bitstream

To generate the bitstream, click “Generate Bitstream” in the Flow Navigator.

Once the bitstream is generated, the following window appears. Select “Open Implemented Design” and click “OK”.

The implemented design will open in Vivado showing you a map of the Zynq device and how the design has been placed. In our case, we haven’t used any of the FPGA fabric (only the PS), so the map is empty for the most part.

Export the hardware to SDK

Once the bitstream has been generated, the hardware design is done and we’re ready to develop the code to run on the processor. This part of the design process is done in Xilinx Software Development Kit (SDK), so from Vivado we must first export the project to SDK.

In Vivado, from the File menu, select “Export->Export hardware”.
In the window that appears, tick “Include bitstream” and click “OK”.
Again from the File menu, select “Launch SDK”.
In the window that appears, use the following settings and click “OK”.

At this point, the SDK loads and a hardware platform specification will be created for your design. You should be able to see the hardware specification in the Project Explorer of SDK as shown in the image below.

You are now ready to create a software application to run on the PS.

Create a Software application

At this point, your SDK window should look somewhat like this:

To demonstrate creating an application for the Zynq, we’ll create a hello world application that will send “hello world” out the UART and to our PC.

From the File menu, select New->Application Project.
In the first page of the New Project wizard, choose a name for the application. I’ve chosen “hello_world”. Click “Next”.
On the templates page, select the “Hello World” template and click “Finish”.
The SDK will generate a new application which you should find in the Project Explorer as in the image below.

The “hello_world” folder contains the software application, which you can browse and modify. I suggest you take a look at the code that is contained here and get familiar with it.

The “hello_world_bsp” folder contains the Board Support Package, which is a bunch of libraries that are provided for accessing the various peripherals and features of the Zynq. In general, it’s better not to modify this code because it gets written over every time you use the “Clean project” option. Instead, if you want to make customizations, try to keep it in the application folder.

Once the software application has been created, it is automatically built. Once the application is built, we are ready to test the design on hardware.

Test the design on the hardware

To test the design, we are using the MicroZed board from Avnet. Make the following setup before continuing:

On the MicroZed, set the JP1, JP2 and JP3 jumpers all to the 1-2 position.
Connect the USB-UART (J2) to a USB port of your PC.
Connect a Platform Cable USB II programmer (or similar device) to the JTAG connector. Connect the programmer to a USB port of your PC.

Now you need to open up a terminal program on your PC and set it up to receive the “hello world” message. I use Miniterm because I’m a Python fan, but you could use any other terminal program such as Putty. Use the following settings:

Comport – check your device manager to find out what comport the MicroZed popped up as. In my case, it was COM12 as shown below.
Baud rate: 115200bps
Data: 8 bits
Parity: None
Stop bits: 1

Now that your PC is ready to receive the “hello world” message, we are ready to send our bitstream and software application to the hardware.

In the SDK, from the menu, select Xilinx Tools->Program FPGA.
In the Program FPGA window, we select the hardware platform to program. We have only one hardware platform, so click “Program”.
The bitstream will be loaded onto the Zynq and we are ready to load the software application. Select the “hello_world” folder in the Project Explorer, then from the menu, select Run->Run.
In the Run As window, select “Launch on Hardware (GDB)” and click “OK”.
The application will be loaded on the Zynq PS and it will be executed. Look out for the “Hello World” message in your terminal window!

What now?

In the following tutorials I’ll go through more concepts such as:

Creating a custom IP block and integrating your own HDL code
Using the DMA engine for transferring data between IP and memory
Running Linux on the PS
Accessing IP through a Linux application

Source code

The TCL build script and source code for this project is shared on Github here:

https://github.com/fpgadeveloper/microzed-base

For instructions on rebuilding the project from sources, read my post on version control for Vivado projects.


Version control for Vivado projects

August 1, 2014, 12:39 pm


Vivado generates a whole bunch of files when you create a project, and it’s not very clear which are source files and which are generated files. The best approach is to consider them all to be generated files and to put none of them in version control. Instead, create a folder structure for your sources that makes sense to you and use Tcl scripts to build the project and import the sources.

Vivado was designed to be completely Tcl driven. When you work in the GUI, you’ll probably notice that everything you do launches a Tcl command in the Tcl console at the bottom of the screen. This was a really nice design choice by Xilinx because it allows us to control the tools from scripts, thus allowing automation and proper version control.

Now just because we use scripts for good version control doesn’t mean we have to give up the GUI. Vivado lets you generate a project from a script, work on it in the GUI and then save the project back to a Tcl script. It sounds simple, and it is; there are just a few things to understand first.

Example folder structure

Determine the folder/file structure that you want to use for the version controlled files. Here is the folder/file structure that I use for the designs that I put on Github (entries without a file extension are folders):

Vivado
- ip_repo
- src
  - bd
    - design_1.tcl
  - hdl
- build.tcl
- build.bat

I call the top level folder Vivado to group together all the source files related to the Vivado project (some of my project repositories also have a folder for Python applications, EDK projects, etc). The “build.tcl” file is a Tcl script that will build the Vivado project from the sources. The “build.bat” file is a batch file that launches the build script. The “src” folder contains all the version controlled sources, such as VHDL and Verilog code, as well as scripts for generating parts of the project. The “bd” folder contains a script that generates the block design and is called from the “build.tcl” script. The “ip_repo” folder is generated by Vivado and contains version controlled sources for IP blocks used in the design.

How to generate a build script for a Vivado project

A template script for building a Vivado project can be generated from the GUI.

Assuming you’ve created a project using the GUI – from the File menu, select ‘Write Project Tcl’.
Choose a name and location for the output Tcl script file. I generally use the name ‘build.tcl’ and locate it in the project folder.

Now modify the build script!

The template script that is generated by Vivado serves as a good example, however it makes a few assumptions that don’t fit with the typical way you’d use a version controlled repository:

The script assumes you want to create the project in the working directory (i.e. the folder from which Vivado was run) and not the folder containing the build script. We’d typically want to check out the sources into a new folder, run the build script and have the project generated in that folder.
The script assumes that you’ve version controlled the HDL wrapper and the block design (.bd) file, AND that they will be found at the same locations that Vivado put them in the original project. This doesn’t work if you want to regenerate the project in the folder containing the build script, because if your repository contains these files, located exactly as they were in the original project, the script fails, saying that the project files already exist. To make things worse, when it opens the block design (.bd) file, it modifies the date in the file, thus bringing it out of sync with version control when no design modification was actually made.

So the way around the first assumption is to replace these lines:

# Set the reference directory for source file relative paths
set origin_dir "."

# Set the directory path for the original project from where this script was exported
set orig_proj_dir "[file normalize "$origin_dir/orig-project"]"

# Create project
create_project myproject ./myproject

With these:

# Set the reference directory to where the script is
set origin_dir [file dirname [info script]]

# Create project
create_project myproject $origin_dir/myproject

The way around the second assumption is to generate the block design and the wrapper from a script.

Fortunately, we don’t have to write the script to generate the block design, Vivado can make the script for us. To generate the block design script in Vivado, with the block design open, select File->Export->Export block design.

Save this file in the “src/bd” folder and commit it to version control.

Now we can modify the “build.tcl” script to call the block design script and generate the HDL wrapper. At the end of the file, add the following lines:

# Create block design
source $origin_dir/src/bd/design_1.tcl

# Generate the wrapper
set design_name [get_bd_designs]
make_wrapper -files [get_files $design_name.bd] -top -import

Finally, you’ll have to remove the lines that import the HDL wrapper file and the block design (.bd) file into the project.

What files to commit to version control

In general, don’t commit anything within the project sub-folder that was created by Vivado. You want to keep ALL controlled sources including scripts out of that folder.

Commit “build.tcl” (discussed above) to version control.
Commit “src/bd/design_1.tcl” (the exported block design script, discussed above) to version control. Do not commit the block design (.bd) file itself.
If you added any other sources to the design such as VHDL or Verilog files, make sure they are located in the “src/hdl” folder and commit these as well.
If you created custom IP, you can commit all the files in the IP repository (ip_repo). The ip_repo folder should be at the same level as your “src” folder.

How to rebuild a Vivado project from the build script and sources

These steps assume that you’ve cloned the project repository to a new location (otherwise Vivado will try to regenerate the project files that already exist and it’ll produce an error message).

My preferred method is to run a batch file called “build.bat” that I commit to version control in the same location as the “build.tcl” file. The batch file should contain the following line:

C:\Xilinx\Vivado\2014.2\bin\vivado.bat -mode batch -source build.tcl

So from Windows Explorer I only need to double click on that batch file and Vivado generates the project files. Then I open the project in Vivado by double clicking on the generated .xpr file.

Just for completeness, here is one way to call the script from Vivado:

From the welcome screen in Vivado, select Window->Tcl Console.
The Tcl console should open up at the bottom of the welcome screen.
In the Tcl console, where it says ‘Type a Tcl command here’, type the command ‘cd <location-of-your-project>’ and press ENTER to change the working directory to that of your project (the location of the build.tcl file).
Then in the Tcl console, type the command ‘source build.tcl’ and press ENTER to execute the build script.

Vivado will then regenerate the project files and open them in the GUI.

Dealing with modifications

When you make modifications to the project using the GUI, always remember to save them back to the version controlled scripts by using the “File->Write Project Tcl” and “File->Export->Export block design” options.

Need an example?

For an example of how to version control a Vivado project, check out this base system project that I’ve shared on Github:

https://github.com/fpgadeveloper/microzed-base


Creating a custom IP block in Vivado

August 4, 2014, 8:00 am


Tutorial Overview

In this tutorial we’ll create a custom AXI IP block in Vivado and modify its functionality by integrating custom VHDL code. We’ll be using the Zynq SoC and the MicroZed as a hardware platform. For simplicity, our custom IP will be a multiplier which our processor will be able to access through register reads and writes over an AXI bus.

The multiplier takes in two 16 bit unsigned inputs and outputs one 32 bit unsigned output. A single 32 bit write to the IP will contain the two 16 bit inputs, separated by the lower and higher 16 bits. A single 32 bit read from the peripheral will contain the result from the multiplication of the two 16 bit inputs. The design doesn’t serve much purpose, but it is a good example of integrating your own code into an AXI IP block.

Requirements

Before following this tutorial, you’ll need the following:

Vivado 2014.2
MicroZed
Platform Cable USB II (or equivalent JTAG programmer)

Start from a base project

You can do this tutorial with any existing Vivado project, but I’ll start with the base system project for the MicroZed that you can access here:

Base system project for the MicroZed

Create the Custom IP

With the base Vivado project opened, from the menu select Tools->Create and package IP.
The Create and Package IP wizard opens. If you are used to the ISE/EDK tools you can think of this as being similar to the Create/Import Peripheral wizard. Click “Next”.
On the next page, select “Create a new AXI4 peripheral”. Click “Next”.
Now you can give the peripheral an appropriate name, description and location. Click “Next”.
On the next page we can configure the AXI bus interface. For the multiplier we’ll use AXI lite, and it’ll be a slave to the PS, so we’ll stick with the default values.
On the last page, select “Edit IP” and click “Finish”. Another Vivado window will open which will allow you to modify the peripheral that we just created.

At this point, the peripheral that has been generated by Vivado is an AXI lite slave that contains 4 x 32 bit read/write registers. We want to add our multiplier code to the IP and modify it so that one of the registers connects to the multiplier inputs and another to the multiplier output.

Add the multiplier code to the peripheral

You can find the multiplier code on Github at the link below. Download the “multiplier.vhd” file and save it somewhere; the location is not important for now.

https://github.com/fpgadeveloper/microzed-custom-ip/blob/master/ip_repo/my_multiplier_1.0/src/multiplier.vhd

Note that these steps must be done in the Vivado window that contains the peripheral we just created (not the base project).

From the Flow Navigator, click “Add Sources”.
In the window that appears, select “Add or Create Design Sources” and click “Next”.
On the next window, click “Add Files”.
Browse to the “multiplier.vhd” file, select it and click “OK”.
Make sure you tick “Copy sources into IP directory” and then click “Finish”.

The multiplier code is now added to the peripheral, however we still have to instantiate it and connect it to the registers.

Modify the Peripheral

At this point, your Project Manager Sources window should look like the following:

Open the branch “my_multiplier_v1_0 – arch_imp”.
Double click on the “my_multiplier_v1_0_S00_AXI_inst” file to open it.
The source file should be open in Vivado. Find the line with the “begin” keyword and add the following code just above it to declare the multiplier and the output signal:

signal multiplier_out : std_logic_vector(31 downto 0);

component multiplier
port (
  clk: in std_logic;
  a: in std_logic_vector(15 downto 0);
  b: in std_logic_vector(15 downto 0);
  p: out std_logic_vector(31 downto 0));
end component;

Now find the line that says “-- Add user logic here” and add the following code below it to instantiate the multiplier:

multiplier_0 : multiplier
port map (
  clk => S_AXI_ACLK,
  a => slv_reg0(31 downto 16),
  b => slv_reg0(15 downto 0),
  p => multiplier_out);

Find this line of code “reg_data_out <= slv_reg1;” and replace it with “reg_data_out <= multiplier_out;”.
In the process statement just a few lines above, replace “slv_reg1” with “multiplier_out”.
Save the file.
You should notice that the “multiplier.vhd” file has been integrated into the hierarchy because we have instantiated it from within the peripheral.
Click on “IP File Groups” in the Package IP tab of the Project Manager.
Click the “Merge changes from IP File Group Wizard” link.
The “IP File Groups” should now have a tick.
Now click “Review and Package IP”.
Now click “Re-package IP”.

The peripheral will be packaged and the Vivado window for the peripheral should be automatically closed. We should now be able to find our IP in the IP catalog. Now the rest of this tutorial will be done from the original Vivado window.

Add the IP to the design

Click the “Add IP” icon.
Find the “my_multiplier” IP and double click it.
The block should appear in the block diagram and you should see the message “Designer Assistance available. Run Connection Automation”. Click the connection automation link.
Click the “my_multiplier_0” peripheral from the drop-down menu.
In the window that appears, set Clock connection to “Auto” and click “OK”.
The new block diagram should look like this:
Generate the bitstream.
When the bitstream is generated, select “Open the implemented design” and click “OK”.

Export the hardware design to SDK

Once the bitstream has been generated, we can export our design to SDK where we can then write code for the PS. The PS is going to write data to our multiplier and read back the result.

In Vivado, from the File menu, select “Export->Export hardware”.
In the window that appears, tick “Include bitstream” and click “OK”.
Again from the File menu, select “Launch SDK”.
In the window that appears, use the following settings and click “OK”.

You are now ready to create a software application to run on the PS.

Create a Software application

At this point, your SDK window should look somewhat like this:

To make things easy for us, we’ll use the template for the hello world application and then modify it to test the multiplier.

From the File menu, select New->Application Project.
In the first page of the New Project wizard, choose a name for the application. I’ve chosen “hello_world”. Click “Next”.
On the templates page, select the “Hello World” template and click “Finish”.
The SDK will generate a new application which you should find in the Project Explorer as in the image below.

The “hello_world” folder contains the Hello World software application, which we will modify to test our multiplier.

Modify the Software Application

Now all we need to do is modify the software application to test our multiplier peripheral.

From the Project Explorer, open the “hello_world/src” folder. Open the “helloworld.c” source file.
Replace all the code in this file with the following.

#include "platform.h"
#include "xbasic_types.h"
#include "xil_printf.h"
#include "xparameters.h"

// Base address of the multiplier registers, taken from xparameters.h.
// Declared volatile so that every access really goes out to the hardware.
volatile Xuint32 *baseaddr_p = (volatile Xuint32 *)XPAR_MY_MULTIPLIER_0_S00_AXI_BASEADDR;

int main()
{
    init_platform();

    xil_printf("Multiplier Test\n\r");

    // Write multiplier inputs to register 0 (a = 0x0002, b = 0x0003)
    *(baseaddr_p+0) = 0x00020003;
    xil_printf("Wrote: 0x%08x \n\r", *(baseaddr_p+0));

    // Read multiplier output from register 1
    xil_printf("Read : 0x%08x \n\r", *(baseaddr_p+1));

    xil_printf("End of test\n\n\r");

    return 0;
}

Save and close the file.

Test the design on the hardware

To test the design, we are using the MicroZed board from Avnet. Make the following setup before continuing:

On the MicroZed, set the JP1, JP2 and JP3 jumpers all to the 1-2 position.
Connect the USB-UART (J2) to a USB port of your PC.
Connect a Platform Cable USB II programmer (or similar device) to the JTAG connector. Connect the programmer to a USB port of your PC.

Now you need to open up a terminal program on your PC and set it up to receive the test messages. I use Miniterm because I’m a Python fan, but you could use any other terminal program such as Putty. Use the following settings:

Comport – check your device manager to find out what comport the MicroZed popped up as. In my case, it was COM12 as shown below.
Baud rate: 115200bps
Data: 8 bits
Parity: None
Stop bits: 1

Now that your PC is ready to receive the test messages, we are ready to send our bitstream and software application to the hardware.

In the SDK, from the menu, select Xilinx Tools->Program FPGA.
In the Program FPGA window, we select the hardware platform to program. We have only one hardware platform, so click “Program”.
The bitstream will be loaded onto the Zynq and we are ready to load the software application. Select the “hello_world” folder in the Project Explorer, then from the menu, select Run->Run.
In the Run As window, select “Launch on Hardware (GDB)” and click “OK”.
The application will be loaded on the Zynq PS and it will be executed. Look out for the results in your terminal window!

We’re sending two 16-bit inputs 0x02 and 0x03 and the result is 0x06 as expected.

Source code

The TCL build script and source code for this project is shared on Github here:

https://github.com/fpgadeveloper/microzed-custom-ip

For instructions on rebuilding the project from sources, read my post on version control for Vivado projects.


Using the AXI DMA in Vivado

August 6, 2014, 11:31 am


In a previous tutorial I went through how to use the AXI DMA Engine in EDK. Now I’ll show you how to use the AXI DMA in Vivado. We’ll create the hardware design in Vivado, then write a software application in the Xilinx SDK and test it on the MicroZed board (source code is shared on Github for the MicroZed and the ZedBoard, see links at the bottom).

What is DMA?

DMA stands for Direct Memory Access and a DMA engine allows you to transfer data from one part of your system to another. The simplest usage of a DMA would be to transfer data from one part of the memory to another, however a DMA engine can be used to transfer data from any data producer (eg. an ADC) to a memory, or from a memory to any data consumer (eg. a DAC).

Tutorial overview

In this design, we’ll use the DMA to transfer data from memory to an IP block and back to the memory. In principle, the IP block could be any kind of data producer/consumer such as an ADC/DAC FMC, but in this tutorial we will use a simple FIFO to create a loopback. Afterwards, you’ll be able to break the loop and insert whatever custom IP you like.

The block diagram above illustrates the design that we’ll create. The processor and DDR memory controller are contained within the Zynq PS. The AXI DMA and AXI Data FIFO are implemented in the Zynq PL. The AXI-lite bus allows the processor to communicate with the AXI DMA to set up, initiate and monitor data transfers. The AXI_MM2S and AXI_S2MM are memory-mapped AXI4 buses and provide the DMA access to the DDR memory. The AXIS_MM2S and AXIS_S2MM are AXI4-streaming buses, which source and sink a continuous stream of data, without addresses.

Notes:

MM2S stands for Memory-Mapped to Streaming, whereas S2MM stands for Streaming to Memory-Mapped.
When Scatter-Gather is used, there is an extra AXI bus between the DMA and the memory controller. It was left out of the diagram for simplicity.

Requirements

Before following this tutorial, you’ll need the following:

Vivado 2014.2
MicroZed
Platform Cable USB II (or equivalent JTAG programmer)

Start from the base project

We’ll start this tutorial with the base system project for the MicroZed that you can access here:

Base system project for the MicroZed

Add the AXI DMA

Open the base project in Vivado.
In the Flow Navigator, click ‘Open Block Design’.
The block diagram should open and you should only have the Zynq PS in the design.
Click the ‘Add IP’ icon and double click ‘AXI Direct Memory Access’ from the catalog.

Connect the Memory-mapped AXI buses

The DMA block should appear and designer assistance should be available. Click the ‘Run Connection Automation’ link and select ‘/axi_dma_0/S_AXI_LITE’ from the drop-down menu.
Click ‘OK’ in the window that appears. Vivado will connect the AXI-lite bus of the DMA to the General Purpose AXI Interconnect of the PS.
Your block diagram should now look like this:
Now we need to connect AXI buses M_AXI_SG, M_AXI_MM2S and M_AXI_S2MM of the DMA to a high performance AXI slave interface on the PS. Our PS doesn’t seem to have a high-performance AXI slave interface, so we need to change the Zynq configuration to enable one. Double click on the Zynq block.
Select ‘PS-PL Configuration’, open the ‘HP Slave AXI Interface’ branch and tick the ‘S AXI HP0 interface’ to enable it. Then click OK.
The high-performance AXI slave ports should now be visible in the block diagram, and designer assistance should be available. Click the ‘Run Connection Automation’ link and select ‘/processing_system7_0/S_AXI_HP0’ from the drop-down menu.
In the window that appears, make sure that Vivado intends to connect it to the DMA and click OK.
Designer assistance should again be available. Click the ‘Run Connection Automation’ link and select ‘/axi_dma_0/M_AXI_SG’ from the drop-down menu.
In the window that appears, click OK.
Designer assistance should still be available. Click the ‘Run Connection Automation’ link and select ‘/axi_dma_0/M_AXI_S2MM’ from the drop-down menu.
In the window that appears, click OK.

Now all the memory-mapped AXI buses are connected to the DMA. Now we only have to connect the AXI streaming buses to our loopback FIFO and connect the DMA interrupts.

Add the FIFO

Click the ‘Add IP’ icon and double click ‘AXI4-Stream Data FIFO’ from the catalog.
The FIFO should be visible in the block diagram. Now we must connect the AXI-streaming buses to those of the DMA. Click the ‘S_AXIS’ port on the FIFO and connect it to the ‘M_AXIS_MM2S’ port of the DMA.
Then click the ‘M_AXIS’ port on the FIFO and connect it to the ‘S_AXIS_S2MM’ port of the DMA.
Now we must connect the FIFO clock and reset. Click the ‘s_axis_aresetn’ port of the FIFO and connect it to the ‘axi_resetn’ port of the DMA.
Click the ‘s_axis_aclk’ port of the FIFO and connect it to the ‘s_axi_lite_aclk’ port of the DMA.

Remove the AXI-Streaming status and control ports of the DMA

In our design, we won’t need the AXI-Streaming status and control ports which are used to transmit extra information alongside the data stream. You might use them if you were connecting to the AXI Ethernet core or a custom IP that made use of them.

In the block diagram, double click the AXI DMA block.
Un-tick the ‘Enable Control / Status Stream’ option and click OK.

Connect the DMA interrupts to the PS

Our software application will test the DMA in polling mode, but to be able to use it in interrupt mode, we need to connect the interrupts ‘mm2s_introut’ and ‘s2mm_introut’ to the Zynq PS.

First we have to enable interrupts from the PL. Double click the Zynq block and select the Interrupts tab.
Tick ‘Fabric Interrupts’ and ‘IRQ_F2P[15:0]’ to enable them, and click OK.
Click the ‘Add IP’ icon and double-click ‘Concat’ from the catalog.
Connect the ‘dout’ port of the Concat to the ‘IRQ_F2P’ port of the Zynq PS.
Connect the ‘mm2s_introut’ port of the DMA to the ‘In0’ port of the Concat.
Connect the ‘s2mm_introut’ port of the DMA to the ‘In1’ port of the Concat.
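
If you later switch the application to interrupt mode, it helps to know which interrupt IDs the two DMA signals end up on. The sketch below assumes that input In0 of the Concat lands on IRQ_F2P[0] and In1 on IRQ_F2P[1], and uses the Zynq-7000 mapping in which IRQ_F2P[7:0] correspond to interrupt IDs 61 to 68 and IRQ_F2P[15:8] to IDs 84 to 91. The function name is my own and is not part of any Xilinx API, so check the generated xparameters.h for the values in your own design.

```python
def irq_f2p_to_gic_id(bit):
    """Map a bit of IRQ_F2P[15:0] to its Zynq-7000 interrupt ID.

    Assumption: bits 0-7 map to IDs 61-68 and bits 8-15 to IDs 84-91.
    """
    if not 0 <= bit <= 15:
        raise ValueError("IRQ_F2P has only 16 inputs (0 to 15)")
    return 61 + bit if bit < 8 else 84 + (bit - 8)

# Order of the Concat inputs used in the steps above (In0 first).
concat_inputs = ["mm2s_introut", "s2mm_introut"]

for index, signal in enumerate(concat_inputs):
    print(f"In{index}: {signal} -> interrupt ID {irq_f2p_to_gic_id(index)}")
```

In practice the BSP works this out for you from the exported hardware description, so the function is only useful as a sanity check.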

Validate and build the design

From the menu select Tools->Validate Design.
You should get this message saying that validation was successful.
We can clean up the block diagram by clicking the Regenerate Layout icon.
Our block diagram now looks like this:
In the Flow Navigator, click ‘Generate Bitstream’.

Export the hardware design to SDK

Once the bitstream has been generated, we can export our design to SDK where we can develop the software application that will set up a DMA transfer, wait for completion and then verify the loopback.

In Vivado, from the File menu, select “Export->Export hardware”.
In the window that appears, tick “Include bitstream” and click “OK”.
Again from the File menu, select “Launch SDK”.
In the window that appears, use the following settings and click “OK”.

We are now ready to create the software application.

Create a Software application

At this point, your SDK window should look somewhat like this:

To make things easy for us, we’ll use the template for the hello world application and then modify it to test the AXI DMA.

From the File menu, select New->Application Project.
In the first page of the New Project wizard, choose a name for the application. I’ve chosen “hello_world”. Click “Next”.
On the templates page, select the “Hello World” template and click “Finish”.
The SDK will generate a new application which you should find in the Project Explorer as in the image below.

The “hello_world” folder contains the Hello World software application, which we will modify to test our AXI DMA.

Modify the Software Application

We need to modify the hello world software application to test our DMA.

From the Project Explorer, open the “hello_world/src” folder. Open the “helloworld.c” source file.
Replace all the code in this file with the code that you will find on Github here: https://github.com/fpgadeveloper/microzed-axi-dma/blob/master/SDK/hello_world/src/helloworld.c
Save and close the file. The application should build automatically.

The application source code is derived from an example provided by Xilinx in the installation files. You can find it at this location on your PC:

C:\Xilinx\14.7\ISE_DS\EDK\sw\XilinxProcessorIPLib\drivers\axidma_v7_02_a\examples\xaxidma_example_sg_poll.c

By the way, if you didn’t know about it already, that folder contains heaps of examples that you will find useful. I suggest you check it out.

Test the design on the hardware

To test the design, we are using the MicroZed board from Avnet. Make the following setup before continuing:

On the MicroZed, set the JP1, JP2 and JP3 jumpers all to the 1-2 position.
Connect the USB-UART (J2) to a USB port of your PC.
Connect a Platform Cable USB II programmer (or similar device) to the JTAG connector. Connect the programmer to a USB port of your PC.

Then open a terminal program on your PC and configure it with the following settings:

COM port: check your Device Manager to find out which COM port the MicroZed appears as. In my case, it was COM12.
Baud rate: 115200bps
Data: 8 bits
Parity: None
Stop bits: 1

Now that your PC is ready to receive the test messages, we are ready to send our bitstream and software application to the hardware.

In the SDK, from the menu, select Xilinx Tools->Program FPGA.
In the Program FPGA window, we select the hardware platform to program. We have only one hardware platform, so click “Program”.
The bitstream will be loaded onto the Zynq and we are ready to load the software application. Select the “hello_world” folder in the Project Explorer, then from the menu, select Run->Run.
In the Run As window, select “Launch on Hardware (GDB)” and click “OK”.
The application will be loaded on the Zynq PS and it will be executed. Look out for the results in your terminal window!

Source code

The TCL build script and source code for this project are shared on Github at the following links:

For the MicroZed: https://github.com/fpgadeveloper/microzed-axi-dma
For the ZedBoard: https://github.com/fpgadeveloper/zedboard-axi-dma

For instructions on rebuilding the project from sources, read my post on version control for Vivado projects.

Using AXI Ethernet Subsystem and GMII-to-RGMII in a Multi-port Ethernet design

December 8, 2015, 12:06 pm

Tutorial Overview

In this two-part tutorial, we’re going to create a multi-port Ethernet design in Vivado 2015.4 using both the GMII-to-RGMII and AXI Ethernet Subsystem IP cores. We’ll then test the design on hardware by running an echo server on lwIP. Our target hardware will be the ZedBoard armed with an Ethernet FMC, which adds 4 additional Gigabit Ethernet ports to our platform. Ports 0 to 2 of the Ethernet FMC will connect to separate AXI Ethernet Subsystem IPs which will be configured in DMA mode. Port 3 of the Ethernet FMC will connect to GEM1 of the Zynq PS through the GMII-to-RGMII IP, while the on-board Ethernet port of the ZedBoard will connect to GEM0.

Requirements

To go through this tutorial, you’ll need the following:

Vivado 2015.4 (see note below)
ZedBoard
Ethernet FMC (standard or robust model will work)
Platform Cable USB II (or equivalent JTAG programmer)

Note: The tutorial text and screenshots are suitable for Vivado 2015.4; however, the sources in the Git repository will be regularly updated to the latest version of Vivado.

Change Vivado’s default language

Before creating our project, we need to make sure that Vivado is configured to use VHDL as its default language. We won’t be writing any HDL code, but the constraints that we use depend on the project language being set to VHDL, so it’s important that we set this:

Open Vivado.
From the menu, select Tools->Options.
In the “General” tab, set the target language to VHDL.

Create a new Vivado project

Follow these steps to create a new project in Vivado:

From the welcome screen, click “Create New Project”.
Specify a folder for the project. I’ve created a folder named “zedboard_qgige”. Click “Next”.
For the Project Type window, choose “RTL Project” and tick “Do not specify sources at this time”. Click “Next”.
For the Default Part window, select the “Boards” tab and then select the “ZedBoard Zynq Evaluation and Development Kit” and click “Next”.
Click “Finish” to complete the new project wizard.

Setup the Zynq PS

We start off the design by adding the Zynq PS (Processing System) and making the connections specified by the ZedBoard board definition file, which is included with Vivado 2015.4.

From the Vivado Flow Navigator, click “Create Block Design”.
Specify a name for the block design. Let’s go with the default “design_1” and leave it local to the project. Click “OK”.
In the Block Design Diagram, you will see a message that says “This design is empty. Press the (Add IP) button to add IP.”. Click on the “Add IP” icon either in the message, or in the vertical toolbar.
The IP catalog will appear. Go to the end of the list and double click on “ZYNQ7 Processing System” – it should be the second last on the list.
In the Block Design Diagram, you will see a message that says “Designer Assistance available. Run Block Automation”. Click on the “Run Block Automation” link.
Block Automation uses the board definition file for the ZedBoard to make connections and pin assignments to external hardware such as the DDR and the on-board Ethernet port. Just make sure that “Apply Board Preset” is ticked and click OK.
Now our block diagram has changed and we can see that the DDR and FIXED_IO are connected externally. We can now configure the Zynq PS for our specific needs. Double click on the Zynq PS block to open the Re-customize IP window.
From the Page Navigator, select “Clock Configuration” and open the “PL Fabric Clocks” tree. Notice that “FCLK_CLK0” is enabled by default and set to 100MHz; this will serve as the clock for our AXI interfaces. Now enable “FCLK_CLK1” and “FCLK_CLK2” and set them to 125MHz and 200MHz respectively. FCLK_CLK1 (125MHz) will be needed by the AXI Ethernet Subsystem blocks and it will be used to clock the RGMII interfaces. FCLK_CLK2 (200MHz) will be required by both the GMII-to-RGMII and AXI Ethernet Subsystem IPs and it is needed to clock the IDELAY_CTRLs.
Now from the Page Navigator, select “PS-PL Configuration”. By default the Master AXI GP0 interface should be enabled as you can see in the image. You must also enable the High Performance Slave AXI HP0 interface as shown. The HP Slave AXI Interface provides a high-bandwidth connection to the DDR3 memory controller – this will be needed by the DMA engines which we will create after we add the AXI Ethernet Subsystem blocks to our design.
The last thing to do is to enable interrupts. From the Page Navigator, select “Interrupts” and tick to enable “Fabric Interrupts” then “IRQ_F2P[15:0]”. Interrupts will be generated by all the Ethernet IPs and by the DMA engine IPs.
Now click “OK” to close the Re-customize IP window.
You will notice that the PS block has gotten a bit bigger and it has more ports. Connect FCLK_CLK0 (100MHz) to the GP Master AXI clock input (M_AXI_GP0_ACLK) by dragging a trace from one pin to the other. This action will draw a wire between the pins and make the connection.
Also connect the FCLK_CLK0 to the HP Slave AXI Interface clock input (S_AXI_HP0_ACLK).
Now open the IP Catalog and add 3 x AXI 1G/2.5G Ethernet Subsystem IPs to the design (you will have to add one at a time). Once you have done this, you should have three AXI Ethernet Subsystem blocks in your design: “axi_ethernet_0”, “axi_ethernet_1” and “axi_ethernet_2”.
To wire the AXI Ethernet Subsystem blocks in DMA mode, we’ll use the block automation feature, however before running this, we want to configure the “shared logic” option of the cores first. The AXI Ethernet Subsystem IP is designed with the option to include “shared logic” in the core. The shared logic includes an IDELAY_CTRL to control the IODELAYs on the RGMII interface, as well as an MMCM to generate a 90 degree skewed clock for generation of the RGMII TX clock. When we use multiple AXI Ethernet Subsystem blocks in the one design, we can save on resources by having only one of those cores include the “shared logic”. The core containing the “shared logic” will naturally share the IDELAY_CTRL with the other cores, and it will have outputs for the clocks generated by the MMCM so that it can share them too. Let’s make “axi_ethernet_0” be the one that contains the shared logic, so double click on it to bring up the Re-customize IP window.
Go to the “Shared Logic” tab (don’t worry about any of the other options for now). Select the “Include Shared Logic in Core” option and click OK.
Now open the Re-customize IP window for “axi_ethernet_1”, go to the “Shared Logic” tab and select the “Include Shared Logic in IP Example Design” option and click OK. Do the same for “axi_ethernet_2”.
Now we can wire up the Ethernet blocks by using the block automation feature. Notice there is a message in your block diagram saying “Designer Assistance available. Run Block Automation”. Click on the “Run Block Automation” link.
In the “Run Block Automation” window, you will have automation options for each of the Ethernet blocks. Tick to enable all of them, then select them one by one and make sure that they are each configured for an “RGMII” physical interface and a “DMA” connection to the AXI Streaming Interfaces. By default they will all be configured for GMII so it is important to set the physical interface correctly here and for each one of them. Then click OK.
After the block automation has run its course, you will notice that it has added a Clocking Wizard block called “axi_ethernet_0_refclk”. This block generates a 125MHz and 200MHz clock to feed the Ethernet blocks, however we will be using the Zynq PS to generate those clocks, so we don’t need this block. Click once on the “axi_ethernet_0_refclk” block and press Delete to remove it from the block diagram.
We can now use the Connection Automation feature to wire up our AXI interfaces. Click “Run Connection Automation” from the block diagram.
Like before, tick to enable ALL of the connections. We then need to select each of the interfaces one-by-one and choose the right settings. Luckily the defaults are good for us in Vivado 2015.4, but check the screenshots below if you are using a different version. Then click OK.
When the automation feature has run its course, you will notice that again you have the option to “Run Connection Automation”. Maybe this will not be the case in future versions of Vivado, but it is the case for 2015.4. So again click “Run Connection Automation”, tick to enable all the interfaces for automation and make sure the settings are correct (defaults are good for 2015.4). They should all be configured to connect to the HP Slave AXI Interface (S_AXI_HP0) and use the “Auto” clock connection.
Now it’s time to add the GMII-to-RGMII block for Port 3 of the Ethernet FMC. Open the IP Catalog and double click on “Gmii to Rgmii”.
Double click on the GMII-to-RGMII block to open the Re-customize IP window.
In the “Core Functionality” tab, tick “Instantiate IDELAYCTRL in design”, set the PHY Address to 8 and select the option “Skew added by PHY”. Notice that the GMII-to-RGMII core has an MDIO input and an MDIO output. Why does the MDIO bus have to pass through the core? That’s because the GMII-to-RGMII core has logic that sits on the MDIO bus to receive commands from the MAC for configuration. The PHY address we specify here allows us to give the core a unique address on the MDIO bus, and it is very important that the address be different to that of the external PHY. On the Ethernet FMC, all the PHYs are configured with address 0, so we can give the GMII-to-RGMII core an address of 8 without creating a bus conflict. As for the “Skew added by the PHY” option, this concerns the RGMII transmit clock. Some PHYs, including the 88E1510 on the Ethernet FMC, have a feature to add a delay to the incoming RGMII TX clock so that it aligns well for sampling the incoming RGMII transmit data. The GMII-to-RGMII core allows us to specify where the skew is added: in the PHY or in the FPGA fabric (MMCM). The skew should be added by one or the other, never both, or the clock will be poorly aligned for sampling the data and the RGMII interface will fail.
Now open the “Shared Logic” tab and select “Include Shared Logic in Core”.
To connect the GMII-to-RGMII core to the PS, we need to enable GEM1 in the PS. Double click on the Zynq PS block and select “MIO Configuration” in the Page Navigator. Tick to enable “ENET 1” and select “EMIO” (Extended Multiplexed Input/Output). Selecting EMIO allows us to route GEM1 through to the FPGA fabric, so that we can then connect it to our GMII-to-RGMII core and then out to the Ethernet FMC.
Now you should see two extra ports on the Zynq PS block: “GMII_ETHERNET_1” and “MDIO_ETHERNET_1”. Make a connection between “MDIO_GEM” of the GMII-to-RGMII block and “MDIO_ETHERNET_1” of the Zynq PS.
Now make a connection between “GMII” of the GMII-to-RGMII block and “GMII_ETHERNET_1” of the Zynq PS.
Now we need to make the “MDIO_PHY” and “RGMII” ports external so that they can connect to the Ethernet FMC. Right click on each of these interfaces and select “Make External”.
Now find the external ports on the right hand side of the block diagram. We’ll need to rename them so that the names fit with the constraints that we will later add to the project. Click first on the “MDIO_PHY” port and rename it “mdio_io_port_3”. Then click on the “RGMII” port and rename it “rgmii_port_3”. The port name can be changed in the “External Interface Properties” window that normally sits just below the “Design” window and to the left of the block diagram (see image below).
Once you have changed the names, your ports should now look like this.
The GMII-to-RGMII block doesn’t provide us with a reset signal for the PHY, so we have to add some logic to provide that signal. Open the IP Catalog and add a “Utility Reduced Logic” IP.
Double click on the “util_reduced_logic_0” block and set the “C Size” to 1 and the “C Operation” to “and”. Then click OK.
Now connect the input of the “util_reduced_logic_0” block to the active-low peripheral reset output of the Processor System Reset block.
Now right click on the output of the “util_reduced_logic_0” block and select “Make External”.
The external port will have been named “Res” by default. Change this name to “reset_port_3” so that it matches the constraints we will later add to the project.
The MDIO, RGMII and reset ports of the 3 x AXI Ethernet Subsystem blocks will have already been externalized during the automation process, however they will have been given odd names so we need to change those names to match the constraints that we will later add to the project. Go through the ports one-by-one and rename them as follows:
1. axi_ethernet_0 should have its external ports named “mdio_io_port_0”, “rgmii_port_0” and “reset_port_0”.
2. axi_ethernet_1 should have its external ports named “mdio_io_port_1”, “rgmii_port_1” and “reset_port_1”.
3. axi_ethernet_2 should have its external ports named “mdio_io_port_2”, “rgmii_port_2” and “reset_port_2”.
Now open the IP Catalog and add a Concat (concatenate) IP to the design. The concatenate IP takes a series of single inputs and concatenates them into a vector output. We will need this IP to be able to connect all the interrupts to the IRQ_F2P[0:0] vector input of the PS.
Double click on the Concat block and set the number of ports to 12. Then click OK.
Now we must connect all the interrupts to the Concat IP. One-by-one, go through and make all the following connections. Note that the order of pin assignment is not important because it will all be transferred to the SDK in the hardware description and be correctly mapped by the BSP.
1. Connect axi_ethernet_0_dma/mm2s_introut to xlconcat_0/In0
2. Connect axi_ethernet_0_dma/s2mm_introut to xlconcat_0/In1
3. Connect axi_ethernet_1_dma/mm2s_introut to xlconcat_0/In2
4. Connect axi_ethernet_1_dma/s2mm_introut to xlconcat_0/In3
5. Connect axi_ethernet_2_dma/mm2s_introut to xlconcat_0/In4
6. Connect axi_ethernet_2_dma/s2mm_introut to xlconcat_0/In5
7. Connect axi_ethernet_0/mac_irq to xlconcat_0/In6
8. Connect axi_ethernet_0/interrupt to xlconcat_0/In7
9. Connect axi_ethernet_1/mac_irq to xlconcat_0/In8
10. Connect axi_ethernet_1/interrupt to xlconcat_0/In9
11. Connect axi_ethernet_2/mac_irq to xlconcat_0/In10
12. Connect axi_ethernet_2/interrupt to xlconcat_0/In11
Now connect the Concat output to the IRQ_F2P input of the Zynq PS.
Now let’s connect FCLK_CLK2, the 200MHz clock, to the Ethernet blocks. First connect FCLK_CLK2 to the “ref_clk” pin of “axi_ethernet_0”.
Then connect FCLK_CLK2 to the “clkin” pin of the GMII-to-RGMII block. Now for those who are curious, you probably noticed that the GMII-to-RGMII block only has one clock input (clkin). This input must be connected to 200MHz which will be used to clock the IDELAY_CTRL, but also it will be used to generate three other clocks: 125MHz, 25MHz and 2.5MHz which are used for link speeds 1Gbps, 100Mbps, 10Mbps respectively. The actual link speed is determined by the PHY during the autonegotiation process and it is up to the processor to read the link speed from the PHY and then pass this value onto the GMII-to-RGMII core so that it uses the appropriate clock. By default, it is set to use the 2.5MHz clock for a link speed of 10Mbps.
Now let’s connect the “tx_reset” and “rx_reset” ports of the GMII-to-RGMII block to the peripheral reset signal. Remember to connect BOTH of them (one at a time).
Now we need to connect the 125MHz clock to the “gtx_clk” port of the AXI Ethernet Subsystem block “axi_ethernet_0” (the one containing the shared logic). We did enable FCLK_CLK1 for this purpose, and you can make that connection if you wish, but for this tutorial we will explore another possibility. The Ethernet FMC has an on-board 125MHz oscillator which can also be used to supply “gtx_clk”. In order to use it, we just need to add a differential buffer to our design. Open the IP Catalog and add a “Utility Buffer” to the design.
Connect the output of the buffer to the “gtx_clk” input of “axi_ethernet_0”.
Now click on the plus (+) symbol on the input of the buffer to show the differential inputs.
Now right-click on each of the individual inputs of the buffer and select “Make External”. You should end up with two external input ports named “IBUF_DS_P[0:0]” and “IBUF_DS_N[0:0]”.
Rename those external input ports to “ref_clk_p” and “ref_clk_n” respectively.
Now there is only one more thing to add. The Ethernet FMC has two inputs that are used to enable the on-board 125MHz oscillator and to select its frequency (it can alternatively be set to 250MHz). We need to add some constants to our design to enable the oscillator and to set its output frequency to 125MHz. Open the IP Catalog and add two Constant IPs.
By default, they will be set to constant outputs of 1, which is exactly what we need. So all we must do is make their outputs external and rename them to “ref_clk_oe” and “ref_clk_fsel”. The result should be as shown in the image below.
Save the block diagram by clicking “File->Save Block Design”.
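
The three clocks that the GMII-to-RGMII core derives from its 200MHz “clkin” input (see the step above that connects FCLK_CLK2) follow from the width of the MAC-side data path. The sketch below checks that arithmetic. It assumes the usual convention of 8 bits per clock at 1Gbps and 4 bits per clock (MII-style nibbles) at 100Mbps and 10Mbps. The table and function names are mine and are not taken from any Xilinx API.

```python
# Bits transferred per clock cycle at each link speed.
# Assumption: 8-bit transfers at 1Gbps, 4-bit (nibble) transfers at 10/100Mbps.
BITS_PER_CLOCK = {1000: 8, 100: 4, 10: 4}

def required_clock_mhz(link_speed_mbps):
    """Return the interface clock in MHz needed to sustain the link speed."""
    return link_speed_mbps / BITS_PER_CLOCK[link_speed_mbps]

for speed in (1000, 100, 10):
    print(f"{speed}Mbps link -> {required_clock_mhz(speed)}MHz clock")
```

The results match the 125MHz, 25MHz and 2.5MHz clocks mentioned in the text, which is why the core has to be told the negotiated link speed before it can pick the right one.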

Create the HDL wrapper

Our Vivado block diagram is complete and we now need to create an HDL wrapper for the design.

Open the “Sources” tab from the Block Design window.
Right-click on “design_1” and select “Create HDL wrapper” from the drop-down menu.
From the “Create HDL wrapper” window, select “Let Vivado manage wrapper and auto-update”. Click “OK”.

Add the constraints file

The last thing we need to add to our project will be the constraints. The constraints file contains:

Pin assignments for all the external ports of our block design, which in our case are the pins that are routed to the FMC connector and through to our Ethernet FMC
A definition for the 125MHz reference clock that comes in from the Ethernet FMC
IODELAY grouping constraints to assign each port to one of two groups corresponding to the I/O bank that it occupies. We don’t want the tools trying to group all the ports to the same IDELAY_CTRL, but rather there should be one instantiated for each I/O bank – in our case, there is one instantiated in the “axi_ethernet_0” and another in “gmii_to_rgmii_0”.

Follow these steps to add the constraints file to your project:

Download the constraints file from this link: Constraints for ZedBoard and Ethernet FMC using GMII-to-RGMII and AXI Ethernet
Save the constraints file somewhere on your hard disk.
From the Project Manager, click “Add Sources”.
Then click “Add or create constraints”.
Then click “Add files” and browse to the constraints file that you downloaded earlier.
Tick “Copy constraints files into project” and click Finish.
You should now see the constraints file in the Sources window.

Sources Git repository

Sources for re-generating this project automatically can be found on Github at the links below. There is a version of the project for the ZedBoard and the MicroZed. There is also a version that uses only the AXI Ethernet Subsystem IP.

Instructions for re-generating those projects can be found in this post: Version control for Vivado projects. We will also discuss that in the following tutorial, as well as testing the projects on actual hardware.

Testing the project on hardware

In the second part of this tutorial we will generate the bitstream for this project, export it to the SDK and then test an echo server application on the hardware. The echo server application runs on lwIP (light-weight IP), the open source TCP/IP stack for embedded systems.

If you have any questions about this tutorial, or if you run into problems, please leave me a comment below.

FPGA Network tap: Designing the Ethernet pass-through

December 29, 2015, 8:20 am

When designing a network tap on an FPGA, the logical place to start is the pass-through between two Ethernet ports. In this article, I’ll discuss a convenient way to connect two Ethernet ports at the PHY-MAC interface, which will form the basis of a network tap. The pass-through will be designed in Vivado for the ZedBoard combined with an Ethernet FMC. In future articles, I’ll discuss other aspects of an FPGA network tap design, including monitor ports, packet filtering, and opportunities for hardware acceleration in the FPGA.

Pass-through at the MAC interface (GMII, RGMII or SGMII)

The criteria for an ideal pass-through are:

it must be completely transparent to all devices communicating over the link,
it must preserve the fidelity of the link, and ideally,
it should add very little latency to the link.

From those criteria we could suppose that if we could simply tap the wires of the Ethernet cable, we’d have our ideal tap. Unfortunately, due to the complexity of Gigabit Ethernet signals, we can’t do that. Instead, we have to break the link and connect each end to its own Ethernet PHY. The pass-through is implemented on the other side of the PHYs, at the MAC interface, which is typically one of the following standards: GMII, RGMII or SGMII. In the case of the Ethernet FMC, which uses 4x Marvell 88E1510 Ethernet PHYs, we’re dealing with the RGMII interface.

RGMII signals are double-data-rate (DDR), so in order to bring the data into our FPGA fabric and send it back out, we need to use the IDDR and ODDR primitives. Fortunately, there is an IP that implements the RGMII interface for us and provides us with a single-data-rate interface which we can use for the pass-through and for “tapping”. The GMII-to-RGMII IP core, included with Vivado, converts an RGMII interface to a GMII interface. To implement our pass-through, all we have to do is instantiate two GMII-to-RGMII converters, route them to two separate Ethernet PHYs and loop together the two GMII interfaces.

[Figure: fpga_network_tap_4]

The block diagram above illustrates the general idea. Port 0 and port 1 of the Ethernet FMC are each connected to a GMII-to-RGMII converter, and the GMII interfaces are passed through to the opposite port.

Use FIFOs to connect the GMII interfaces

When connecting one GMII interface to another, you will notice that the transmit interface has a separate clock to the receive interface. The GMII TX data, TX enable and TX error signals are all synchronous to the TX clock, whereas the GMII RX data, RX valid and RX error signals are all synchronous to the RX clock. So you can’t directly connect the GMII transmit interface to the GMII receive interface – you have to use proper clock domain crossing. The easy way to do that is by using a FIFO with independent read and write clocks – you’ll need two of them, one for each direction of data flow.

[Figure: fpga_network_tap_2]

Wire the FIFOs as elastic buffers

The natural way to connect FIFOs to the transmit and receive interfaces is to use the “rx_dv” (RX valid) output of the GMII interfaces to drive the “write enable” inputs of the FIFOs, and to use the “valid” output of the FIFOs to drive the “tx_en” (TX enable) inputs of the GMII interface. However, in our application, there is a problem with this method. If even momentarily the FIFO is being read slightly faster than it is being written to, you will have occasions where the FIFO is empty for one clock cycle and forced to de-assert the “valid” signal. This is a problem because the GMII interface “enable” and “valid” signals are only supposed to be de-asserted at the end of a packet, so this gap effectively terminates the Ethernet packet that you are feeding to the PHY.

The better solution is to feed the “enable” and “valid” signals through the FIFOs, and to design the FIFOs as elastic buffers. Remember that once you decide that a FIFO will be written to and read from constantly, using two independent clocks, it must be designed as an elastic buffer or you risk losing data due to the FIFO reaching the full or empty state. In the elastic buffer solution, we still use our “tx_en” and “rx_dv” signals, but we use them to determine what data the elastic buffer can discard at the write interface (when it’s too full), as well as when the elastic buffer can momentarily halt the read interface (when it’s too empty).

An elastic buffer is not perfect and it relies on a certain amount of redundancy being present in the data, but in typical Ethernet applications, there is enough time between packets that the job of designing a reliable elastic buffer is quite simple.

So when you want to wire up a FIFO as a simple elastic buffer, there are two things to set up:

1. Programmable full and empty outputs

These signals will tell us when the FIFOs are too full or too empty and they allow us to keep the FIFO occupancy within a certain range. Typically that “range” is centered at the mid-point of the FIFO, for example, if our FIFO contains 1000 words, then we could set our desired occupancy to be between 400 and 600. In this case, the programmable full output would be set to 600, and the programmable empty output would be set to 400.

2. Write enable and read enable logic

The write and read enable inputs must be connected to logic functions that will throttle the FIFO, filling it up when it gets too empty and emptying it when it gets too full. The functions are:

write enable <= NOT prog_full OR rx_valid
read enable <= NOT prog_empty OR tx_valid

[Figure: fpga_network_tap_3]
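
To get a feel for how the write enable and read enable functions behave, here is a minimal behavioural model of the elastic buffer, not the actual HDL from the project. It assumes a FIFO whose output word is visible before it is read (first-word fall-through), so that “tx_valid” is the valid flag of the word waiting at the output. The depth (64), the thresholds (24 and 40), the class and function names, and the exaggerated 1% clock mismatch are all my own choices for illustration.

```python
from collections import deque

IDLE = (False, 0)  # word carried between packets: valid flag low

class ElasticBuffer:
    """Behavioural model of a FIFO wired as an elastic buffer."""

    def __init__(self, depth=64, prog_empty=24, prog_full=40):
        self.depth = depth
        self.prog_empty = prog_empty  # "too empty" at or below this occupancy
        self.prog_full = prog_full    # "too full" at or above this occupancy
        self.fifo = deque()

    def write(self, rx_valid, data):
        too_full = len(self.fifo) >= self.prog_full
        write_enable = (not too_full) or rx_valid  # idle words dropped when too full
        if write_enable:
            if len(self.fifo) >= self.depth:
                raise OverflowError("FIFO overflowed: thresholds too tight")
            self.fifo.append((rx_valid, data))

    def read(self):
        if not self.fifo:
            return IDLE
        too_empty = len(self.fifo) <= self.prog_empty
        tx_valid = self.fifo[0][0]  # valid flag of the word at the FIFO output
        read_enable = (not too_empty) or tx_valid  # only pause between packets
        return self.fifo.popleft() if read_enable else IDLE

def make_stream(num_packets, packet_len, gap_len):
    """Build a list of (valid, data) words: idle gaps separating packets."""
    words = []
    for p in range(num_packets):
        words.extend([IDLE] * gap_len)
        words.extend((True, p * 1000 + i) for i in range(packet_len))
    words.extend([IDLE] * gap_len)
    return words

def run(stream, write_skip=0, read_skip=0):
    """Clock the buffer; skipping every Nth tick makes that side N-th slower."""
    buf, out, max_occ, tick = ElasticBuffer(), [], 0, 0
    src = iter(stream)
    while True:
        tick += 1
        if not (write_skip and tick % write_skip == 0):
            word = next(src, None)
            if word is None:
                break
            buf.write(*word)
            max_occ = max(max_occ, len(buf.fifo))
        if not (read_skip and tick % read_skip == 0):
            out.append(buf.read())
    return out, max_occ

def packets(words):
    """Group consecutive valid words into packets (unfinished tail ignored)."""
    result, current = [], []
    for valid, data in words:
        if valid:
            current.append(data)
        elif current:
            result.append(current)
            current = []
    return result

stream = make_stream(num_packets=20, packet_len=200, gap_len=12)
for label, kwargs in (("slow reader", {"read_skip": 100}),
                      ("fast reader", {"write_skip": 100})):
    out, max_occ = run(stream, **kwargs)
    got = packets(out)
    intact = got == packets(stream)[:len(got)]
    print(f"{label}: {len(got)} packets out, intact={intact}, max occupancy={max_occ}")
```

In both cases the buffer absorbs the clock mismatch by shrinking or stretching the idle gap between packets, never the packets themselves. If the gaps were too short or the thresholds too close to the FIFO limits, the model would either overflow or split a packet, which is the failure mode described above.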

Configuring the GMII-to-RGMII converter

For the GMII-to-RGMII converter to operate properly, we have to let it know the actual link speed that was set up by the PHY during auto-negotiation. But how do we communicate this information to the core?

You may have noticed that the GMII-to-RGMII core contains two MDIO ports, one of which is normally connected to the MAC, and the other which is normally externalized and connected to the PHY. The GMII-to-RGMII core “sits” on the MDIO bus, as though it were another PHY, and it can be configured over that MDIO bus. So we communicate the link speed information to the core over the MDIO bus and the typical sequence is as follows:

Trigger the auto-negotiation sequence in the PHY (optional)
Read the actual link speed from the PHY after auto-negotiation has completed
Write the actual link speed to the GMII-to-RGMII core

The last step involves writing to a specific register within the GMII-to-RGMII core with a value that corresponds to the link speed. To do this we need the address of the register to write to (0x10) and the “PHY” address of the GMII-to-RGMII core (I quote the word PHY because the core is not a PHY). The “PHY” address of the GMII-to-RGMII core is specified in Vivado, and is 8 by default. In order to communicate with two GMII-to-RGMII cores in our design, we have connected one of the MDIO “inputs” to GEM1 of the Zynq PS. We then connected the MDIO “output” to the MDIO “input” of the second GMII-to-RGMII converter (see block diagram above). This way, we can configure both GMII-to-RGMII converters using only the MDIO port of GEM1. In Vivado, we configure the GMII-to-RGMII cores to have different “PHY addresses”, specifically 7 and 8, so that we don’t create a bus conflict.

fpga_network_tap_5

Depending on the established link speed, we need to write the following values to register 0x10 of both of the GMII-to-RGMII converters:

For a link speed of 1Gbps, we need to write 0x140.
For a link speed of 100Mbps, we need to write 0x2100.
For a link speed of 10Mbps, we need to write 0x100.
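
The last step of the sequence, with the register values listed above, can be sketched in C as follows. This is an illustration only: the function names and the mdio_write callback type are hypothetical and not part of any Xilinx driver. In a real application the callback would be a thin wrapper around whatever MDIO write routine your MAC driver provides (on the Zynq GEM, for example, the XEmacPs driver's XEmacPs_PhyWrite). The addresses 7 and 8 are the “PHY addresses” that we assigned to the two cores in Vivado.

```c
#include <stddef.h>
#include <stdint.h>

#define GMII2RGMII_SPEED_REG 0x10u  /* register address given in the text */

/* Hypothetical callback: write `value` to register `reg` of the device at
 * MDIO address `phy_addr`. Returns 0 on success. */
typedef int (*mdio_write_fn)(uint8_t phy_addr, uint8_t reg, uint16_t value);

/* Translate a link speed in Mbps into the value for register 0x10.
 * Returns 0 on success, -1 for a speed the core does not support. */
int gmii2rgmii_speed_value(unsigned link_mbps, uint16_t *value)
{
    switch (link_mbps) {
    case 1000: *value = 0x0140; return 0;
    case 100:  *value = 0x2100; return 0;
    case 10:   *value = 0x0100; return 0;
    default:   return -1;
    }
}

/* Write the same link speed to both converters on the shared MDIO bus. */
int gmii2rgmii_set_speed(mdio_write_fn mdio_write, unsigned link_mbps)
{
    static const uint8_t core_addrs[] = { 7, 8 };
    uint16_t value;
    size_t i;

    if (mdio_write == NULL || gmii2rgmii_speed_value(link_mbps, &value) != 0)
        return -1;

    for (i = 0; i < sizeof core_addrs / sizeof core_addrs[0]; i++) {
        if (mdio_write(core_addrs[i], GMII2RGMII_SPEED_REG, value) != 0)
            return -1;
    }
    return 0;
}
```

Keeping the speed-to-value mapping in its own function makes it easy to check against the table above without any hardware attached.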

For reliable operation, the link on Port 0 should be the same speed as that on Port 1, i.e. don’t try to use this pass-through to connect networks of different speeds.

Sources Git repository

The sources for re-generating this project automatically can be found on Github at the link below.

Ethernet FMC Network Tap Github Source Code

If you want to better understand how the sources are organized, you can read this post: Version control for Vivado projects.

Next on the FPGA network tap

In the next post on the FPGA network tap, we’ll hook up the other two ports of the Ethernet FMC as monitor ports which will enable “listening” by a third device. Port 2 will send a copy of all packets going in one direction, while port 3 will send a copy of all packets going in the other direction, so the result will be a full gigabit network tap. We’ll also hook the ports up to soft TEMAC IPs and look at filtering the packets within the FPGA fabric.

↧

Running a lwIP Echo Server on a Multi-port Ethernet design

January 5, 2016, 7:08 am

Tutorial Overview

This tutorial is the follow-up to Using AXI Ethernet Subsystem and GMII-to-RGMII in a Multi-port Ethernet design. In this part of the tutorial we will generate the bitstream, export the hardware description to the SDK and then test the echo server application on our hardware. The echo server application runs on lwIP (light-weight IP), the open source TCP/IP stack for embedded systems. Our hardware platform is the Avnet ZedBoard combined with the Ethernet FMC.

Regenerate the Vivado project

Firstly, for those of you who did not follow the first part of this tutorial, we will use the scripts in the Git repository for this project to regenerate the Vivado project. If you followed the first part of the tutorial correctly, you should not need to complete this step. Please note that the Git repository is regularly updated for the latest version of Vivado, so you must download the last “commit” for the version of Vivado that you are using.

Download the sources from Github here: https://github.com/fpgadeveloper/zedboard-qgige
Depending on your operating system:
- If you are using a Windows machine, open Windows Explorer, browse to the “Vivado” folder within the sources you just downloaded. Double-click on the “build.bat” file to run the batch file.
- If you are using a Linux machine, run Vivado and then select Window->Tcl Console from the welcome screen. In the Tcl console, use the “cd” command to navigate to the “Vivado” folder within the sources you just downloaded. Then type “source build.tcl” to run the build script.
Once the script has finished running, the Vivado project should be regenerated and located in the “Vivado” folder. Run Vivado and open the newly generated project.

If you did not follow the first part of this tutorial, you may want to open the block diagram and get familiar with the design before continuing.

Generate the bitstream

When you are ready to generate the bitstream, click “Generate Bitstream” in the Flow Navigator.

Once the bitstream is generated, the following window will appear. Select “View Reports” and click “OK”.

zedboard_echo_server_5

Export the hardware to SDK

When the bitstream has been generated, we can export it and the hardware description to the Software Development Kit (SDK). In the SDK we will be able to generate the echo server example design and run it on our hardware.

In Vivado, from the File menu, select “Export->Export hardware”.
In the window that appears, tick “Include bitstream”, select Export to “Local to Project”, and click “OK”.
From the File menu, select “Launch SDK”.
In the window that appears, you need to specify the location of the hardware description and the location of the SDK workspace. We specified earlier to generate the hardware description local to the project (including bitstream), so the Exported location must be “Local to Project”. By preference, we choose to create the SDK workspace local to the project, but you can change this if you wish. Click “OK”.

At this point, the SDK loads and a hardware platform specification will be created for your design.

Create the Echo Server application

At this point, your SDK workspace should contain only the hardware description and no applications:

zedboard_echo_server_14

We add the echo server application by selecting New->Application Project from the File menu.

zedboard_echo_server_15

In the New Project wizard, we want to name the application appropriately, so type “echo_server” as the project name then click “Next”.

zedboard_echo_server_16

The next page allows you to create the new application based on a template. Select the “lwIP Echo Server” template and click “Finish”.

zedboard_echo_server_17

The SDK will generate a new application called “echo_server” and a Board Support Package (BSP) called “echo_server_bsp”, both of which you will find in the Project Explorer as shown below.

zedboard_echo_server_18

By default, the SDK is configured to build the application automatically.

Modify the application

The echo server template application is set up to run on the first AXI Ethernet Subsystem block in our design. This corresponds to PORT0 of the Ethernet FMC. We want to add some code to the application to allow us to select a different port if we choose.

Open the “main.c” file from the echo_server source folder.
After the last “#include” statement, add the following code:

#include "xlwipconfig.h"

/* Set the following DEFINE to the port number (0,1,2 or 3)
* of the Ethernet FMC that you want to hook up
* to the lwIP echo server. Only one port can be connected
* to it in this version of the code.
*/
#define ETH_FMC_PORT 0

/*
* NOTE: When using ports 0..2 the BSP setting "use_axieth_on_zynq"
* must be set to 1. When using port 3, it must be set to 0.
* To change BSP settings: right click on the BSP and click
* "Board Support Package Settings" from the context menu.
*/
#ifdef XLWIP_CONFIG_INCLUDE_AXIETH_ON_ZYNQ
#if ETH_FMC_PORT == 0
#define EMAC_BASEADDR XPAR_AXIETHERNET_0_BASEADDR  // Eth FMC Port 0
#endif
#if ETH_FMC_PORT == 1
#define EMAC_BASEADDR XPAR_AXIETHERNET_1_BASEADDR  // Eth FMC Port 1
#endif
#if ETH_FMC_PORT == 2
#define EMAC_BASEADDR XPAR_AXIETHERNET_2_BASEADDR  // Eth FMC Port 2
#endif
#else /* XLWIP_CONFIG_INCLUDE_AXIETH_ON_ZYNQ is not defined */
#if ETH_FMC_PORT == 3
#define EMAC_BASEADDR XPAR_XEMACPS_1_BASEADDR  // Eth FMC Port 3
#endif
#endif

Then go down to where the define PLATFORM_EMAC_BASEADDR is used, and replace it with EMAC_BASEADDR.

When you save the “main.c” file, the SDK should automatically start rebuilding the application.

Modify the Libraries

The BSP for this project will also have to be modified slightly, at least for Vivado 2015.4 and older versions. There are a few reasons for these modifications, but we would be going off-track to discuss those reasons in detail at this point. The modifications that apply to you will be found in the “README.md” file of the sources that you downloaded earlier. If you are using the latest version of Vivado, you can simply refer to the instructions on the front page of the Git repository.

I strongly recommend that you perform these modifications to the sources in the Vivado installation files – not the sources in the BSP of your SDK workspace. The reason is that the BSP sources will be overwritten with the original sources every time that you re-build the BSP – so you’re better off modifying them at the true source.

Note: These modifications are specific to using the echo server application on the Ethernet FMC. If you are not using the Ethernet FMC, you may not need to make these modifications and you’re better off leaving the library sources as they are.

Setup the hardware

To set up our hardware, we need to configure the ZedBoard for configuration by JTAG, set the VADJ voltage to the appropriate value and correctly attach the Ethernet FMC. Follow these instructions to ensure that your setup is correct:

On the ZedBoard, set the JP7, JP8, JP9, JP10 and JP11 jumpers all to the SIG-GND position. This sets it for configuration by JTAG.
Set the VADJ select jumper (J18) to either 1.8V or 2.5V, depending on the version of Ethernet FMC that you are using. We are using the 2.5V version.
Connect the Ethernet FMC to the FMC connector of the ZedBoard. Apply pressure only to the area above and below the connector – you should feel the two connectors “snap” together.
Now we need to use two screws to fix the Ethernet FMC to the ZedBoard – you should find two M2.5 x 4mm screws included with the ZedBoard. Turn the ZedBoard upside down and use a Phillips head screwdriver to fix the Ethernet FMC to the ZedBoard. Please do not neglect this step: it is very important and will protect your hardware from being damaged in the event that the Ethernet FMC hinges and becomes loose. The FMC connector is not designed to be the only mechanical fixation between the carrier and mezzanine card; the screws are necessary for mechanical and electrical integrity.
Turn the ZedBoard around so that it is sitting the right way up.
Connect the USB-UART (J14) to a USB port of your PC.
Connect a Platform Cable USB II programmer (or similar device) to the JTAG connector. Connect the programmer to a USB port of your PC. Alternatively, if you don’t have a programmer, you can connect a USB cable to the J17 connector of the ZedBoard.
Connect PORT0 of the Ethernet FMC to a gigabit Ethernet port of your PC.
Now plug the ZedBoard power adapter into a wall socket and then into the ZedBoard.
Switch ON the power to the board. You should see the “POWER” LED on the ZedBoard turn on.

Test the Echo Server on hardware

To be able to read the output of the echo server application, we need to use a terminal program such as Putty. Use the following settings:

COM port – check your device manager to find out which COM port the ZedBoard appeared as. In my case, it was COM16 as shown below.
Baud rate: 115200bps
Data: 8 bits
Parity: None
Stop bits: 1

With the terminal program open, we can now load our ZedBoard with the bitstream and then run the echo server application.

In the SDK, from the menu, select Xilinx Tools->Program FPGA.
In the Program FPGA window, we select the hardware platform to program. We have only one hardware platform, so click “Program”.
The bitstream will be loaded onto the Zynq and we are ready to load the software application. Select the “echo_server” folder in the Project Explorer, then from the menu, select Run->Run.
In the Run As window, select “Launch on Hardware (GDB)” and click “OK”.
The application will be loaded on the Zynq PS and it will be executed. The terminal window should display this output from the echo server:

The output indicates that:

The PHY auto-negotiation sequence has completed
The auto-negotiated link-speed is 1Gbps
The DHCP timeout was reached, indicating that the application was not able to get an IP address from a DHCP server
The auto-assigned IP address is 192.168.1.10

Now that the application is running successfully, we can test the echo server by sending packets from our PC to the ZedBoard and looking at what gets sent back.

Ping Test

The lwIP stack answers ping (ICMP echo) requests, so this is a very simple and easy first test to perform using your computer. Just open a command window and type “ping 192.168.1.10”.

zedboard_echo_server_26

Packet Echoing

To test that the echo server is actually doing its job and echoing received packets, you will have to install software that allows you to send and receive arbitrary packets. The software that I use is called Packet Sender and can be downloaded here. Once the software is installed, follow the instructions below to send and receive packets:

Run Packet Sender.
Create a new packet to send using these parameters and then click “Save”:
- Name: Test packet
- ASCII: follow the white rabbit
- IP Address: 192.168.1.10
- Port: 7
- Resend: 0
The packet details will be saved in the Packets tab, and we can now click on the “Send” button to send that packet whenever we want. Click “Send” and see what happens.
If everything went well, the Traffic Log tab should display two packets: one sent by our computer and one received by our computer. They should both occur almost instantaneously, so if you only see one, you’ve probably got a problem with your setup.

If you want to experiment, you can play around with the software by sending more packets, or different kinds of packets.
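
If you would rather script the test than use a GUI tool, the same round trip can be sketched in C. This is a minimal sketch under a few assumptions: it uses POSIX sockets (so Linux or macOS as written; Windows would need Winsock instead), it assumes the echo server is listening on TCP port 7 as in the Packet Sender settings above, and the function name echo_roundtrip is my own.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Send `len` bytes of `payload` to ip:port over TCP and read the echo into
 * `reply` (which must hold at least `len` bytes).
 * Returns 0 if the reply matches the payload byte for byte, -1 otherwise. */
int echo_roundtrip(const char *ip, uint16_t port,
                   const void *payload, size_t len, void *reply)
{
    struct sockaddr_in addr;
    int fd;
    int rc = -1;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1)
        return -1;                      /* not a valid IPv4 address string */

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    if (connect(fd, (struct sockaddr *)&addr, sizeof addr) == 0) {
        const char *out = payload;
        char *in = reply;
        size_t sent = 0, got = 0;

        /* send() may accept fewer bytes than requested, so loop. */
        while (sent < len) {
            ssize_t n = send(fd, out + sent, len - sent, 0);
            if (n <= 0)
                break;
            sent += (size_t)n;
        }

        /* Likewise, the echo may arrive in several pieces. */
        while (sent == len && got < len) {
            ssize_t n = recv(fd, in + got, len - got, 0);
            if (n <= 0)
                break;
            got += (size_t)n;
        }

        if (sent == len && got == len && memcmp(payload, reply, len) == 0)
            rc = 0;
    }

    close(fd);
    return rc;
}
```

With the board running, a call such as echo_roundtrip("192.168.1.10", 7, msg, strlen(msg), reply) with msg set to "follow the white rabbit" should return 0 if the echo server is doing its job.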

Changing ports

So far we’ve been using PORT0 of the Ethernet FMC to test the echo server, but suppose we wanted to use one of the other ports 1,2 or 3. You can configure the port on which to run lwIP by setting the ETH_FMC_PORT define that we added earlier to the main.c file of the SDK application. Valid values for ETH_FMC_PORT are 0,1,2 or 3.

One other thing to be aware of is the BSP setting called “use_axieth_on_zynq”. This parameter specifies whether the BSP will be used with AXI Ethernet Subsystem or with something else: Zynq GEM, Ethernet lite, etc. Remember that in our Vivado design we connected ports 0, 1 and 2 to an AXI Ethernet Subsystem block, and we connected port 3 to the GEM1 of the Zynq PS. Therefore, when selecting the port on which you wish to run lwIP, remember to correctly set the “use_axieth_on_zynq” parameter:

When using ports 0..2 the BSP setting “use_axieth_on_zynq” must be set to 1.
When using port 3, the BSP setting “use_axieth_on_zynq” must be set to 0.

The application will not compile if the correct BSP settings have not been set. To change BSP settings: right click on the BSP and click Board Support Package Settings from the context menu.
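
With the code we added to main.c, a mismatch between ETH_FMC_PORT and the BSP setting shows up only indirectly, as an undefined EMAC_BASEADDR. If you prefer an explicit message, the selection could be written with #error directives, as in the sketch below. This is a suggested variant, not the code from the repository. The first block of defines are stand-ins so that the sketch compiles by itself: in the real project those macros come from xlwipconfig.h and xparameters.h, and the addresses shown here are made up.

```c
/* Stand-ins for this sketch only. In the SDK project these come from
 * xlwipconfig.h and xparameters.h; the addresses below are made up. */
#define XLWIP_CONFIG_INCLUDE_AXIETH_ON_ZYNQ 1
#define XPAR_AXIETHERNET_0_BASEADDR 0x40C00000UL
#define XPAR_AXIETHERNET_1_BASEADDR 0x40C40000UL
#define XPAR_AXIETHERNET_2_BASEADDR 0x40C80000UL
#define XPAR_XEMACPS_1_BASEADDR     0xE000C000UL

#define ETH_FMC_PORT 0

#if ETH_FMC_PORT < 0 || ETH_FMC_PORT > 3
#error "ETH_FMC_PORT must be 0, 1, 2 or 3"
#endif

#ifdef XLWIP_CONFIG_INCLUDE_AXIETH_ON_ZYNQ
/* BSP built with use_axieth_on_zynq = 1: ports 0..2 only */
#if ETH_FMC_PORT == 0
#define EMAC_BASEADDR XPAR_AXIETHERNET_0_BASEADDR
#elif ETH_FMC_PORT == 1
#define EMAC_BASEADDR XPAR_AXIETHERNET_1_BASEADDR
#elif ETH_FMC_PORT == 2
#define EMAC_BASEADDR XPAR_AXIETHERNET_2_BASEADDR
#else
#error "Port 3 uses GEM1: set use_axieth_on_zynq to 0 in the BSP settings"
#endif
#else
/* BSP built with use_axieth_on_zynq = 0: port 3 only */
#if ETH_FMC_PORT == 3
#define EMAC_BASEADDR XPAR_XEMACPS_1_BASEADDR
#else
#error "Ports 0..2 use AXI Ethernet: set use_axieth_on_zynq to 1 in the BSP settings"
#endif
#endif

/* Small helper so the selected address can be inspected at run time. */
unsigned long selected_emac_baseaddr(void)
{
    return EMAC_BASEADDR;
}
```

The behaviour is the same as before (the build fails on a mismatch), but the compiler output now says which BSP setting to change.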

What now?

The echo server application is actually a very good starting place for developing Ethernet applications on the ZedBoard or other Xilinx FPGAs. Here are some potential ways you could “tweak” the echo server application to be useful for other things:

Allow your FPGA designs to be controlled by TCP commands sent from a PC over an Ethernet cable – or over the Internet.
Send data over TCP from your PC to your FPGA and leverage the FPGA for hardware acceleration.
Connect your FPGA to the Internet and design a high-performance IoT device.

Source code Git repository

Below are the links to the source code Git repositories. There is a version of the project for the ZedBoard and the MicroZed. There is also a version that uses only the AXI Ethernet Subsystem IP.

If you enjoyed this tutorial or if you run into problems using it, please leave me a comment below.

↧

Microblaze PCI Express Root Complex design in Vivado

April 13, 2016, 7:00 am

This is the first part of a three part tutorial series in which we will go through the steps to create a PCI Express Root Complex design in Vivado, with the goal of being able to connect a PCIe end-point to our FPGA. We will test the design on hardware by connecting a PCIe NVMe solid-state drive to our FPGA using the FPGA Drive adapter.

Part 1: Microblaze PCI Express Root Complex design in Vivado (this tutorial)

Part 2: Zynq PCI Express Root Complex design in Vivado

Part 3: Connecting an SSD to an FPGA running PetaLinux

In the first part of this tutorial series we will build a Microblaze based design targeting the KC705 Evaluation Board. In the second part, we will build a Zynq based design targeting the PicoZed 7Z030 and PicoZed FMC Carrier Card V2. In part 3, we will test the design on the target hardware using a stand-alone application that will validate the state of the PCIe link and perform enumeration of the PCIe end-points. We will then run PetaLinux on the FPGA and prepare our SSD for use under the operating system.

Requirements

To complete this tutorial you will need the following:

Vivado 2015.4
KC705 Evaluation Board
FPGA Drive adapter
An NVMe PCIe solid-state drive such as this one

Note: The tutorial text and screenshots are suitable for Vivado 2015.4; however, the sources in the Git repository will be regularly updated to the latest version of Vivado.

The Components

The image below gives us a high level view of the design showing each component and how it connects to the Microblaze – only the AXI-Lite interfaces are shown.

microblaze_pcie_root_complex_vivado_93

Let’s talk about the role of each peripheral in the design:

AXI Interrupt Controller – connects to the interrupts generated by the peripherals and routes them through to the Microblaze. It’s generally a good idea to connect all interrupts to the Microblaze when you plan to run PetaLinux.
AXI Central DMA – performs data transfers from one memory mapped space to another. We have the CDMA in this design to be able to make fast data transfers between the PCIe end-point and the DDR3 memory.
AXI Memory Mapped to PCI Express – performs address mapping between the AXI address space and the PCIe address space. It contains the integrated PCI Express block and all the logic required to translate PCIe TLPs into AXI memory mapped reads and writes. The AXI-PCIe block has a slave interface (S_AXI) to allow an AXI master (such as the Microblaze) to access the PCIe address space, and it also has a master interface (M_AXI) which allows a PCIe end-point to access the AXI address space.
AXI UART16550 – UART for console output, which is needed by our stand-alone software application and by PetaLinux.
AXI EthernetLite – provides a 10/100Mbps network connection for PetaLinux.
AXI Quad SPI – provides access to a QSPI Flash device which can be used for storing software, the Linux kernel or FPGA configuration files.
AXI Timer – provides an accurate timer needed by PetaLinux.

The Address Spaces

The image below shows the AXI memory mapped interface connections which is useful for understanding the memory spaces and the devices that have access to them.

microblaze_pcie_root_complex_vivado_94

The important thing to understand is who the bus masters are and what address spaces they can access – the connections could have been made in a number of different ways to achieve the same goal.

The 2 address spaces are:

the DDR3 memory accessed through the MIG, and
the PCIe address space accessed through the S_AXI interface of the AXI-PCIe bridge

The 3 AXI masters and the address spaces they can access are:

the Microblaze can access both the DDR3 memory and the PCIe address space
the PCIe end-point with bus mastering capability can access the DDR3 memory only (via M_AXI port of the AXI-PCIe bridge)
the CDMA can access both the DDR3 memory and the PCIe address space
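
To make the access rules above concrete, here they are written out as a small lookup table in C. This is purely a mental model of the block design, not code that runs on the Microblaze; the enum and function names are my own.

```c
#include <stdbool.h>

/* The three AXI masters in the design. */
enum axi_master {
    MASTER_MICROBLAZE,
    MASTER_PCIE_ENDPOINT,   /* reaches the design via M_AXI of the AXI-PCIe bridge */
    MASTER_CDMA,
    MASTER_COUNT
};

/* The two address spaces. */
enum addr_space {
    SPACE_DDR3,             /* through the MIG */
    SPACE_PCIE,             /* through S_AXI of the AXI-PCIe bridge */
    SPACE_COUNT
};

/* Rows are masters, columns are address spaces. */
static const bool access_table[MASTER_COUNT][SPACE_COUNT] = {
    /*                         DDR3   PCIe  */
    [MASTER_MICROBLAZE]    = { true,  true  },
    [MASTER_PCIE_ENDPOINT] = { true,  false },
    [MASTER_CDMA]          = { true,  true  },
};

/* Returns true if master `m` has a data path to address space `s`. */
bool master_can_access(enum axi_master m, enum addr_space s)
{
    return access_table[m][s];
}
```

The one false entry is the point of the design: a bus-mastering end-point can reach the DDR3 memory and nothing else.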

About PCIe end-point bus mastering

Most PCIe end-points have bus mastering capability. Basically this means that the PCIe end-point can send memory read/write TLPs to the root complex and read/write to a part of the system memory that was allocated for the end-point. Maybe the most common application of end-point bus mastering is the implementation of Message Signaled Interrupts (or MSI). When a PCIe end-point generates an MSI, it simply writes to part of the system memory that was allocated by the root complex.

Create a new Vivado project

We start by creating a new project in Vivado and selecting the KC705 Evaluation board as our target.

From the welcome screen, click “Create New Project”.
Specify a folder for the project. I’ve created a folder named “kc705_aximm_pcie”. Click “Next”.
For the Project Type window, choose “RTL Project” and tick “Do not specify sources at this time”. Click “Next”.
For the Default Part window, select the “Boards” tab and then select the “Kintex-7 KC705 Evaluation Platform” and click “Next”.
Click “Finish” to complete the new project wizard.

Create the block design

Now we need to create and build our block design. We will start by adding the Microblaze and the AXI Memory Mapped PCI Express Bridge.

From the Vivado Flow Navigator, click “Create Block Design”.
Specify a name for the block design. Let’s go with the default “design_1” and leave it local to the project. Click “OK”.
In the Block Design Diagram, you will see a message that says “This design is empty. Press the (Add IP) button to add IP.”. Click on the “Add IP” icon either in the message, or in the vertical toolbar.
The IP catalog will appear. Find and double click on “Microblaze”.
You will see the Microblaze in the block diagram. Double click on it to open the configuration wizard.
The Microblaze has several predefined configurations that can be selected on the first page of the Microblaze Configuration Wizard. We eventually want to run PetaLinux on the Microblaze, so we need to select “Linux with MMU” to get the best configuration for that. Then click “OK” to accept that configuration.
The AXI-PCIe block is going to provide the clock source for most of our design, including the Microblaze. By adding it to our block design at this point, we will then be able to use the Block Automation feature to setup a lot of the required hardware, saving us a lot of time. Find the “AXI Memory Mapped to PCI Express Bridge IP” in the IP Catalog and double click on it to add it to the block diagram.
Now click on “Run Block Automation” which will help us to setup the Microblaze local memory, the Microblaze MDM, the Processor System Reset and the AXI Interrupt Controller.
In the Run Block Automation window, apply the settings shown in the image below. Set the Local Memory to 128KB. Set the Cache Configuration to 16KB. Tick the Interrupt Controller checkbox. Set the Clock Connection to “/axi_pcie_0/axi_aclk_out”. Then click OK.
The block diagram should now look like the image below. Notice that everything so far is driven by the “axi_aclk_out” clock which is driven by the AXI-PCIe block. The reset signals are generated by the Processor System Reset block, which will synchronize the external PCIe reset signal (PERST_N) to the “axi_aclk_out” clock.
Right click on the “ext_reset_in” pin of the Processor System Reset block, and select “Make External”.
Click on the port that was just created (called “ext_reset_in”) and change its name to “perst_n” using the “External Port Properties” window.

Add the MIG

Now let’s add the DDR3 memory to the design. Find the “Memory Interface Generator (MIG 7 series)” in the IP Catalog and double click it to add it to the block diagram.
Click “Run Block Automation” to setup the external connections to the MIG.
In the Run Block Automation window, click “OK”.
The connection automation feature can save us a lot of time setting up the MIG, but if we run it now, Vivado will connect it to the Microblaze through the AXI Interconnect that is already in the design (microblaze_0_axi_periph). There’s nothing particularly wrong with that, but in this design we want to have a separate AXI Interconnect for the MIG so that we can more easily control which blocks have access to the DDR3 and which have access to the peripherals. It’s a point to consider in this design because we will have a PCIe end-point with bus mastering capabilities, and we need to limit what the end-point will have access to. Find “AXI Interconnect” in the IP Catalog and double click on it to add one to the design.
Click on the AXI Interconnect block and rename it to “mem_intercon” using the “Sub-block properties” window.
Double click on the “mem_intercon” block and configure it for 4 slave interfaces, and 1 master interface.
Connect the master interface (M00_AXI) of “mem_intercon” to the slave interface (S_AXI) of the MIG.
Now we can run the connection automation feature. Click “Run Connection Automation”. Select ONLY the “microblaze_0/M_AXI_DC”, “microblaze_0/M_AXI_IC” and “mig_7series_0/sys_rst” connections. Click “OK”.
Connect the master interface (M_AXI) of “axi_pcie_0” to the slave interface (S02_AXI) of the “mem_intercon”. This provides a data path from the PCIe end-point to the DDR3 memory. Note that the PCIe end-point will not be able to access anything else in our design.
Connect the “aresetn” input of the MIG to the “peripheral_aresetn” output of the “rst_mig_7series_0_100M” Processor System Reset block. Note that this Processor System Reset was generated when we used the connection automation feature in the steps above.
As shown in the image below, connect the “S02_ACLK” and “S03_ACLK” clock inputs of the “mem_intercon” to the “axi_aclk_out” output of the AXI-PCIe block. Also connect the “S02_ARESETN” and “S03_ARESETN” inputs to the “peripheral_aresetn” of the “rst_axi_pcie_0_62M” Processor System Reset.

Configure the AXI Memory Mapped to PCI Express Bridge

Double click on the AXI-PCIe block so that we can configure it. On the “PCIE:Basics” tab of the configuration, select “KC705 REVC” as the Xilinx Development Board, and select “Root Port of PCI Express Root Complex” as the port type.
On the “PCIE:Link Config” tab, select a “Lane Width” of 4x and a “Link speed” of 5 GT/s (Gen2). Note that the KC705 has 8 lanes routed to the PCIe edge-connector, however the PCIe SSD that we want to connect with has only 4 lanes.
In the “PCIE:ID” tab, enter a “Class Code” of 0x060400. This is important for the last part of this tutorial series, in which we will be running PetaLinux. The class code will ensure that the correct driver is associated with the AXI to PCIe bridge IP.
In the “PCIE:BARS” tab, tick “Hide RP BAR”, tick “BAR 64-bit Enabled” and set BAR 0 with type “Memory” and a size of 4 Gigabytes. In this configuration, the PCIe end-point is given access to the entire 32-bit address space – remember though that it’s only physically connected to the DDR3 memory.
In the “PCIE:Misc” tab, use the defaults as shown in the image below.
In the “AXI:BARS” tab, use the defaults as shown in the image below. We will later be able to configure the size of the AXI BAR 0 in the Address Editor.
In the “AXI:System” tab, use the defaults as shown in the image below.
In the “Shared Logic” tab, use the defaults as shown in the image below. Click “OK”.
Right click on the “pcie_7x_mgt” port of the AXI-PCIe block and select “Make External”. This will connect the gigabit transceivers to the 4 PCIe lanes on the PCIe edge-connector of the KC705.
Connect the “mmcm_lock” output of the AXI-PCIe block to the “dcm_locked” input of “rst_axi_pcie_0_62M” Processor System Reset block.
Connect the “axi_aresetn” input of the AXI-PCIe block to the “perst_n” port.
Add a “Constant” from the IP Catalog and configure it to output 0 (low). We’ll use this to tie low the “INTX_MSI_Request” input of the AXI-PCIe block. Connect the constant’s output to the “INTX_MSI_Request” input of the AXI-PCIe block.
Add a “Utility Buffer” to the block design. This buffer is going to be connected to a 100MHz clock that will be provided to the KC705 board by the FPGA Drive adapter, via the PCIe edge-connector. A 100MHz reference clock is required by all PCIe devices.
Double click on the utility buffer and on the “Page 0” tab of the configuration window, select “IBUFDSGTE” as the C Buf Type. Click “OK”.
Connect the “IBUF_OUT” output of the utility buffer to the “REFCLK” input of the AXI-PCIe block.
Right click on the “CLK_IN_D” input of the utility buffer and select “Make External”.
Change the name of the created external port to “ref_clk” using the External Interface Properties window.
We need to connect the PCIe interrupt to the Microblaze. Connect the “interrupt_out” output of the AXI-PCIe block to the “In0” input of the interrupt concat “microblaze_0_xlconcat”.
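
A side note on the class code entered on the “PCIE:ID” tab: a PCI class code packs three one-byte fields into 24 bits, and 0x060400 decodes to base class 0x06 (bridge device), sub-class 0x04 (PCI-to-PCI bridge) and programming interface 0x00. The short C sketch below shows the decoding; the type and function names are my own and do not come from any Xilinx or Linux header.

```c
#include <stdint.h>

/* The three fields packed into a 24-bit PCI class code. */
typedef struct {
    uint8_t base_class;  /* bits 23..16 */
    uint8_t sub_class;   /* bits 15..8  */
    uint8_t prog_if;     /* bits 7..0   */
} pci_class_code_t;

/* Split a 24-bit class code into its three fields. */
pci_class_code_t pci_decode_class_code(uint32_t class_code)
{
    pci_class_code_t c;
    c.base_class = (uint8_t)((class_code >> 16) & 0xFFu);
    c.sub_class  = (uint8_t)((class_code >> 8) & 0xFFu);
    c.prog_if    = (uint8_t)(class_code & 0xFFu);
    return c;
}

/* Returns 1 if the class code identifies a PCI-to-PCI bridge
 * (base class 0x06, sub-class 0x04), otherwise 0. */
int pci_is_pci_to_pci_bridge(uint32_t class_code)
{
    pci_class_code_t c = pci_decode_class_code(class_code);
    return c.base_class == 0x06 && c.sub_class == 0x04;
}
```

So the value we entered makes the root port present itself as a PCI-to-PCI bridge, which is what an operating system expects to find at the top of a PCIe hierarchy.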

Add the CDMA

Now we’ll add a Central DMA to this design which will allow us to setup data transfers between the PCIe end-point and the DDR3 memory. We won’t actually test the CDMA in this tutorial series, but it’s an important part of any PCIe design that needs to transfer large amounts of data very quickly over the PCIe link. We will add an AXI Interconnect to allow the CDMA to access both the PCIe end-point and the MIG.

Add an “AXI Central Direct Memory Access” from the IP Catalog to the block design.
Double click on the CDMA block to open the configuration window. Disable Scatter Gather and set “Write/Read Data Width” to 128 as shown in the image below.
Connect the “cdma_introut” output of the CDMA to the “In1” input of the interrupt concat “microblaze_0_xlconcat”.
Add an “AXI Interconnect” from the IP Catalog to the block design. Rename it to “cdma_intercon” using the “Sub-block Properties” window.
Connect the “M_AXI” interface of the CDMA to the “S00_AXI” interface of the “cdma_intercon”.
Connect the “M00_AXI” interface of the “cdma_intercon” to the “S03_AXI” interface of the “mem_intercon”. This provides the data path between the CDMA and the DDR3 memory.
Now connect all the clocks and resets of the “cdma_intercon” as shown in the image below. Connect all the clock inputs to the “axi_aclk_out” output of the AXI-PCIe block. Connect the “ARESETN” input to the “interconnect_aresetn” output of the “rst_axi_pcie_0_62M” Processor System Reset. Connect all other reset inputs to the “peripheral_aresetn” output of the “rst_axi_pcie_0_62M” Processor System Reset.
Double click on the “microblaze_0_axi_periph” interconnect and configure it for 7 master ports. Leave the number of slave ports as 1.
Connect the “M01_AXI” interface of the “microblaze_0_axi_periph” interconnect to the “S_AXI_LITE” interface of the CDMA.
Connect the “m_axi_aclk” input of the CDMA to the “axi_aclk_out” output of the AXI-PCIe block.
Connect the “s_axi_lite_aclk” input of the CDMA to the “axi_aclk_out” output of the AXI-PCIe block.
Connect the “s_axi_lite_aresetn” input of the CDMA to the “peripheral_aresetn” output of the “rst_axi_pcie_0_62M” Processor System Reset block.
Connect the “M01_ACLK” input of the “microblaze_0_axi_periph” to the “axi_aclk_out” output of the AXI-PCIe block.
Connect the “M01_ARESETN” input of the “microblaze_0_axi_periph” to the “peripheral_aresetn” output of the “rst_axi_pcie_0_62M” Processor System Reset block.

Connect the AXI PCIe slave interfaces

The AXI PCIe block has one slave interface for configuration (S_AXI_CTL) and another for accessing the PCIe end-point (S_AXI). The slave interface for configuration must be driven synchronous to the “axi_ctl_aclk_out” clock, so before connecting the slave interfaces, we first need to create a Processor System Reset to generate a reset signal that is synchronous to this clock.

Add a “Processor System Reset” from the IP Catalog.
Connect the “axi_ctl_aclk_out” clock output of the AXI-PCIe block to the “slowest_sync_clk” input of the Processor System Reset just added.
Connect the “ext_reset_in” input of the Processor System Reset to the “perst_n” port.
Connect the “dcm_locked” input of the Processor System Reset to the “mmcm_lock” output of the AXI-PCIe block.
Now the Processor System Reset is setup and we can connect the AXI-PCIe block slave control interface. We want the control interface to be connected to the Microblaze, just like any other peripheral. Connect the “M02_AXI” interface of the “microblaze_0_axi_periph” interconnect to the “S_AXI_CTL” interface of the AXI-PCIe block.
Connect the “M02_ACLK” input of the “microblaze_0_axi_periph” interconnect to the “axi_ctl_aclk_out” output of the AXI-PCIe block.
Connect the “peripheral_aresetn” output of the “proc_sys_reset_0” Processor System Reset to the “M02_ARESETN” input of the “microblaze_0_axi_periph” interconnect.

The other slave interface of the AXI-PCIe block, S_AXI, provides access to the PCIe end-point address space. We want this port to be accessible to both the Microblaze and the CDMA, so we will add another AXI Interconnect to the design.

Add an “AXI Interconnect” from the IP Catalog to the block design. Rename it “pcie_intercon” and configure it to have 2 slave interfaces and 1 master interface.
Connect the “M00_AXI” interface of the “pcie_intercon” to the “S_AXI” interface of the AXI-PCIe block.
Now connect all the clocks and resets of the “pcie_intercon” as shown in the image below. Connect all the clock inputs to the “axi_aclk_out” output of the PCIe block. Connect the “ARESETN” input to the “interconnect_aresetn” output of the “rst_axi_pcie_0_62M” Processor System Reset. Connect all other reset inputs to the “peripheral_aresetn” output of the “rst_axi_pcie_0_62M” Processor System Reset.
Connect the “M01_AXI” interface of the “cdma_intercon” to the “S00_AXI” interface of the “pcie_intercon”.
Connect the “M03_AXI” interface of the “microblaze_0_axi_periph” interconnect to the “S01_AXI” interface of the “pcie_intercon”.
Connect the “M03_ACLK” input of the “microblaze_0_axi_periph” interconnect to the “axi_aclk_out” output of the AXI-PCIe block.
Connect the “M03_ARESETN” of the “microblaze_0_axi_periph” interconnect to the “peripheral_aresetn” of the “rst_axi_pcie_0_62M” Processor System Reset block.

Add the other peripherals

To make our design “Linux ready”, we need to add four more blocks to our design:

UART – for console output
AXI Ethernet Lite – for network connection
AXI Quad SPI – for retrieval of FPGA configuration files, software and Linux kernel from a QSPI Flash
AXI Timer – Microblaze doesn’t have an integrated timer

We will add all 4 blocks to the design and then let the block automation feature handle the connection of these peripherals to the Microblaze.

Add an “AXI UART16550” from the IP Catalog to the block design.
Add an “AXI EthernetLite” from the IP Catalog to the block design.
Add an “AXI Quad SPI” from the IP Catalog to the block design.
Add an “AXI Timer” from the IP Catalog to the block design.
Click “Run Connection Automation” and select all of the connections for the 4 added peripherals.
They will all have been automatically connected to the “microblaze_0_axi_periph” interconnect as shown in the image below.
Connect the “ext_spi_clk” input of the AXI QSPI to the same clock as its “s_axi_aclk” input.
Double click on the “microblaze_0_xlconcat” interrupt concat and change the number of input ports to 6 – we need 4 more to connect the interrupts of our new peripherals.
One-by-one, connect the interrupt outputs of the peripherals to the inputs of the interrupt concat as shown in the image below. The interrupt output for the UART, AXI EthernetLite and AXI QSPI is called “ip2intc_irpt”. The interrupt output for the AXI Timer is called “interrupt”.

Add some debug signals

It’s always nice to have an LED light up to tell us that things are working correctly.

Right click on the “mmcm_lock” output of the AXI-PCIe block and select “Make External”.
Right click on the “init_calib_complete” output of the MIG and select “Make External”.

We will later add a constraint for each one of these ports to assign it to a specific LED on the KC705 board.

Assign addresses

Open the “Address Editor” tab and click the “Auto Assign Address” button.
All addresses should be assigned as in the image below.
By default, the AXI-PCIe control interface (S_AXI_CTL) is allocated 256M, but this will cause a problem for PetaLinux later on, so change it to 64M and then save the block design.

Create the HDL wrapper

Now the block diagram is complete, so we can save it and create an HDL wrapper for it.

Open the “Sources” tab from the Block Design window.
Right-click on “design_1” and select “Create HDL wrapper” from the drop-down menu.
From the “Create HDL wrapper” window, select “Let Vivado manage wrapper and auto-update”. Click “OK”.

Add the constraints

We must now add our constraints to the design for assignment of the PCIe integrated block, the gigabit transceivers, the reference clocks, the LEDs and a few other signals.

Download the constraints file from this link: Constraints for Microblaze PCIe Root Complex design
Save the constraints file somewhere on your hard disk.
From the Project Manager, click “Add Sources”.
Then click “Add or create constraints”.
Then click “Add files” and browse to the constraints file that you downloaded earlier. Select the constraints file, then click “OK”. Now tick “Copy constraints files into project” and click “Finish”.
You should now see the constraints file in the Sources window.

Finished at last!

In the next tutorial: Zynq

In the next part of this tutorial series, we will build another PCIe Root Complex design in Vivado, but this time for the Zynq. The target hardware will be the PicoZed 7Z030 and the PicoZed FMC Carrier Card V2.

Testing the project on hardware

In the third and final part of this tutorial series, we will run a stand-alone application on the hardware which will check the state of the PCIe link and enumerate the connected PCIe end-points. Then we will run PetaLinux on our hardware and make an NVMe PCIe SSD accessible under the operating system.

Sources Git repository

The sources for re-generating this project automatically can be found on Github here: FPGA Drive PCIe Root Complex design

Other useful resources

Here are some other useful resources for creating PCI Express designs:

If you have any questions about this tutorial, or if you run into problems, please leave me a comment below.

Zynq PCI Express Root Complex design in Vivado

April 14, 2016, 5:01 am

This is the second part of a three part tutorial series in which we will create a PCI Express Root Complex design in Vivado with the goal of connecting a PCIe NVMe solid-state drive to our FPGA.

Part 1: Microblaze PCI Express Root Complex design in Vivado

Part 2: Zynq PCI Express Root Complex design in Vivado (this tutorial)

Part 3: Connecting an SSD to an FPGA running PetaLinux

In this second part of the tutorial series, we will build a Zynq based design targeting the PicoZed 7Z030 and PicoZed FMC Carrier Card V2. In part 3, we will then test the design on the target hardware by running a stand-alone application which will validate the state of the PCIe link and perform enumeration of the PCIe end-points. We will then run PetaLinux on the FPGA and prepare our SSD for use under the operating system.

Requirements

To complete this tutorial you will need the following:

Vivado 2015.4
PicoZed 7Z030
PicoZed FMC Carrier Card V2
FPGA Drive adapter
An NVMe PCIe solid-state drive such as this one
A JTAG programmer such as Digilent HS3 JTAG

Note: The tutorial text and screenshots are suitable for Vivado 2015.4; however, the sources in the Git repository will be regularly updated to the latest version of Vivado.

Design Overview

The diagram below shows the block design we are about to build with only the AXI interfaces showing. It shows three main elements: the Zynq PS, the AXI to PCIe bridge and the AXI CDMA. If you went through the previous tutorial where we created the same design for a Microblaze system, you may be wondering why the Zynq design seems so much simpler. The reason is that a lot of the elements required in this design are hidden in the Zynq PS block, including the DDR3 memory controller, UART, Ethernet, Interrupt controller, Timer and QSPI.

[Figure: Zynq block design with only the AXI interfaces showing]

So again let’s look at who the bus masters are and what address spaces they can access:

the Zynq PS can access both the DDR3 memory and the PCIe address space
the PCIe end-point with bus mastering capability can access the DDR3 memory only (via M_AXI port of the AXI-PCIe bridge)
the CDMA can access both the DDR3 memory and the PCIe address space

Install PicoZed board definition files

The first thing we have to do is provide the PicoZed board definition files to our Vivado installation so that the PicoZed will show up in the list of targets when we create a new project. The board definition files contain information about the hardware on the target board and also on how the Zynq PS should be configured in order to properly connect to that hardware.

Download the PicoZed board definition files for Vivado 2015.4 from the PicoZed documentation page.
From inside the ZIP file, copy the folder picozed_7030_fmc2 into the folder C:\Xilinx\Vivado\2015.4\data\boards\board_files of your Vivado installation.

Create a new Vivado project

Let’s kick off the design by creating a new project in Vivado and selecting the PicoZed 7Z030 as our target.

From the welcome screen, click “Create New Project”. Specify a folder for the project. I’ve created a folder named “pz_7z030_aximm_pcie”. Click “Next”.
For the Project Type window, choose “RTL Project” and tick “Do not specify sources at this time”. Click “Next”.
For the Default Part window, select the “Boards” tab and then select the “PicoZed 7030 SOM + FMC Carrier V2” and click “Next”.
Click “Finish” to complete the new project wizard.

Create the block design

In the following steps, we’ll create the block design then add the Zynq PS and the AXI Memory Mapped PCI Express Bridge.

From the Vivado Flow Navigator, click “Create Block Design”.
Specify a name for the block design. Let’s go with the default “design_1” and leave it local to the project. Click “OK”.
Once the empty block design opens, click on the “Add IP” icon. The IP catalog will appear. Find and double click on “ZYNQ7 Processing System”.
The Zynq PS block will be added to the block design. Click Run Block Automation to configure the Zynq PS for our target hardware.
Use the default block automation settings.
Now double click on the Zynq PS to configure it.
In the PS-PL Configuration tab, enable HP Slave AXI interface (S AXI HP0 interface). HP stands for high-performance and this port allows an AXI master to access the DDR3 memory.
In the Clock Configuration tab, disable the PL Fabric Clock FCLK_CLK0, because we won’t be needing it. Instead, most of our design will be driven by the clock supplied by the AXI-PCIe bridge, which is derived from the 100MHz PCIe reference clock.
In the Interrupts tab, enable Fabric Interrupts, IRQ_F2P. This allows us to connect interrupts from our PL (programmable logic) to the Zynq PS.
The Zynq block should look like the image below.

Add the AXI MM to PCIe bridge

From the IP Catalog, add the “AXI Memory Mapped to PCI Express” block to the design.
When the AXI-PCIe block is in the block design, double click on it to configure it.
On the “PCIE:Basics” tab of the configuration, select “Root Port of PCI Express Root Complex” as the port type.
On the “PCIE:Link Config” tab, select a “Lane Width” of 1x and a “Link speed” of 5 GT/s (Gen2). We plan to connect to a 4-lane NVMe PCIe SSD in the next part of this tutorial, but the target hardware only has a single-lane PCIe edge connector.
In the “PCIE:ID” tab, enter a “Class Code” of 0x060400. This is important for the next tutorial, in which we will be running PetaLinux. The class code will ensure that the correct driver is associated with the AXI to PCIe bridge IP.
In the “PCIE:BARS” tab, set BAR 0 with type “Memory” and a size of 1 Gigabyte.
In the “PCIE:Misc” tab, use the defaults as shown in the image below.
In the “AXI:BARS” tab, use the defaults as shown in the image below. We will later be able to configure the size of the AXI BAR 0 in the Address Editor.
In the “AXI:System” tab, use the defaults as shown in the image below.
In the “Shared Logic” tab, use the defaults as shown in the image below. Click “OK”.
Right click on the “pcie_7x_mgt” port of the AXI-PCIe block and select “Make External”. This will connect the gigabit transceiver to the 1-lane PCIe edge-connector of the PicoZed 7030 SOM + FMC Carrier V2.
Right click on the “axi_aresetn” port of the AXI-PCIe block and select “Make External”. This will be connected to the PERST_N signal that is generated by the FPGA Drive adapter.
Rename the created port to “perst_n” using the External Port Properties window.
Add a “Constant” from the IP Catalog and configure it to output 0 (low). We’ll use this to tie low the “INTX_MSI_Request” input of the AXI-PCIe block. Connect the constant’s output to the “INTX_MSI_Request” input of the AXI-PCIe block.
Add a “Utility Buffer” to the block design. This buffer is going to be connected to a 100MHz clock that will be provided to the PicoZed 7030 SOM + FMC Carrier V2 by the FPGA Drive adapter, via the PCIe edge-connector. A 100MHz reference clock is required by all PCIe devices. Double click on the utility buffer and on the “Page 0” tab of the configuration window, select “IBUFDSGTE” as the C Buf Type. Click “OK”.
Connect the “IBUF_OUT” output of the utility buffer to the “REFCLK” input of the AXI-PCIe block.
Right click on the “CLK_IN_D” input of the utility buffer and select “Make External”, then change the name of the created external port to “ref_clk” using the External Interface Properties window.
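
To make two of the settings above a bit more concrete, here is a small illustrative Python sketch (it is not part of the Vivado flow, and the function names are my own). It splits the class code 0x060400 into its three bytes and estimates the raw data rate of the x1 Gen2 link, assuming the 8b/10b line encoding that Gen1 and Gen2 links use.

```python
def decode_class_code(class_code):
    """Split a 24-bit PCI class code into (base class, sub-class, prog IF)."""
    base_class = (class_code >> 16) & 0xFF
    sub_class = (class_code >> 8) & 0xFF
    prog_if = class_code & 0xFF
    return base_class, sub_class, prog_if


def gen2_raw_mbytes_per_s(lanes):
    """Raw data rate of a Gen2 link in MB/s, per direction.

    Gen2 runs at 5 GT/s per lane and uses 8b/10b encoding, so only
    8 of every 10 transferred bits carry data. Packet overhead is ignored.
    """
    data_bits_per_s = 5.0e9 * 8 / 10
    return lanes * data_bits_per_s / 8 / 1e6


base_class, sub_class, prog_if = decode_class_code(0x060400)
print(hex(base_class), hex(sub_class), hex(prog_if))
print(gen2_raw_mbytes_per_s(1), "MB/s for x1")
```

Base class 0x06 is a bridge device and sub-class 0x04 is a PCI-to-PCI bridge, which is what a root port is expected to report. The single lane gives roughly 500 MB/s in each direction before protocol overhead, so a 4-lane SSD will work but will be limited by the link.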

Add the Processor System Resets

Our design will be using the two clocks supplied by the AXI-PCIe bridge: “axi_aclk_out” and “axi_ctl_aclk_out”. We will need to add a Processor System Reset to generate resets for each of those clocks.

From the IP Catalog, add a “Processor System Reset” to the design – this one should automatically be called “proc_sys_reset_0”.
Connect the “axi_ctl_aclk_out” output of the AXI-PCIe block to the “slowest_sync_clk” input of the “proc_sys_reset_0” Processor System Reset.
Connect the “mmcm_lock” output of the AXI-PCIe block to the “dcm_locked” input of the “proc_sys_reset_0” Processor System Reset.
Connect the “ext_reset_in” input of the “proc_sys_reset_0” Processor System Reset to the “perst_n” port.
From the IP Catalog, add another “Processor System Reset” to the design – this one should automatically be called “proc_sys_reset_1”.
Connect the “axi_aclk_out” output of the AXI-PCIe block to the “slowest_sync_clk” input of the “proc_sys_reset_1” Processor System Reset.
Connect the “mmcm_lock” output of the AXI-PCIe block to the “dcm_locked” input of the “proc_sys_reset_1” Processor System Reset.
Connect the “ext_reset_in” input of the “proc_sys_reset_1” Processor System Reset to the “perst_n” port.

Add the CDMA

We’re going to add a Central DMA to this design to allow us to make DMA transfers between the PCIe end-point and the DDR3 memory. We won’t actually test it, as that will be the subject of another tutorial, but most PCIe designs can benefit from having a Central DMA because it allows for higher throughput over the PCIe link using burst transfers.

Add an “AXI Central Direct Memory Access” from the IP Catalog to the block design.
Double click on the CDMA block to open the configuration window. Disable Scatter Gather and set “Write/Read Data Width” to 128 as shown in the image below.

Add the Interrupt Concat

To connect interrupts to the IRQ_F2P port of the Zynq PS, we need to use a Concat.

From the IP Catalog, add a “Concat” to the block design.
By default, it should have two inputs – that’s perfect for us, as we only have 2 interrupts to connect. Connect the output of the Concat to the “IRQ_F2P” port of the Zynq PS.
Connect the “interrupt_out” output of the AXI-PCIe block to the “In0” input of the Concat.
Connect the “cdma_introut” output of the CDMA block to the “In1” input of the Concat.
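
When you later write software for these interrupts, you will need their interrupt IDs. The Python sketch below is only an illustration: it assumes the Concat output drives IRQ_F2P starting at bit 0, and it uses the ID ranges for IRQ_F2P as I understand them from the Zynq-7000 TRM (61 to 68 for bits 0 to 7, 84 to 91 for bits 8 to 15). Please check the TRM before relying on these numbers.

```python
def irq_f2p_to_gic_id(bit):
    """Map an IRQ_F2P bit (0-15) to its GIC interrupt ID (assumed TRM ranges)."""
    if 0 <= bit <= 7:
        return 61 + bit
    if 8 <= bit <= 15:
        return 84 + (bit - 8)
    raise ValueError("IRQ_F2P has only 16 bits")


# Concat inputs in the order they were wired in the steps above.
concat_inputs = {
    "In0": "AXI-PCIe interrupt_out",
    "In1": "CDMA cdma_introut",
}

for bit, (port, source) in enumerate(concat_inputs.items()):
    gic_id = irq_f2p_to_gic_id(bit)
    # Shared peripheral interrupts start at ID 32, so the SPI number is ID - 32.
    print(port, source, "GIC ID", gic_id, "SPI number", gic_id - 32)
```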

Add the AXI Interconnects

Now the last thing to do is add the AXI Interconnects and wire up all the AXI interfaces.

axi_interconnect_0:

From the IP Catalog, add an “AXI Interconnect” to the block design – this one should be automatically named “axi_interconnect_0”. We’ll use this to create two ports for accessing the DDR3 memory.
Re-configure it to have 2 slave ports and 1 master port.
Connect the “M00_AXI” port of the “axi_interconnect_0” to the “S_AXI_HP0” port of the Zynq PS.

axi_interconnect_1:

From the IP Catalog, add another “AXI Interconnect” to the block design – this one should be automatically named “axi_interconnect_1”. We’ll use this to give two masters, the Zynq PS and the CDMA, access to the AXI-PCIe control interface, the PCIe end-point and the CDMA control interface.
Re-configure it to have 2 slave ports and 3 master ports.
Connect the “M00_AXI” port of the “axi_interconnect_1” to the “S_AXI” port of the AXI-PCIe block.
Connect the “M01_AXI” port of the “axi_interconnect_1” to the “S_AXI_CTL” port of the AXI-PCIe block.
Connect the “M02_AXI” port of the “axi_interconnect_1” to the “S_AXI_LITE” port of the CDMA block.

axi_interconnect_2:

From the IP Catalog, add another “AXI Interconnect” to the block design – this one should be automatically named “axi_interconnect_2”. We’ll use this to allow the CDMA to access both the DDR3 memory and the PCIe end-point.
By default, it should already have 1 slave port and 2 master ports, which is exactly what we need.
Connect the “M00_AXI” port of the “axi_interconnect_2” to the “S01_AXI” port of the “axi_interconnect_0” (the first interconnect we created).
Connect the “M01_AXI” port of the “axi_interconnect_2” to the “S01_AXI” port of the “axi_interconnect_1” (the second interconnect we created).

Now for the rest of the connections:

Connect the “M_AXI” port of the CDMA block to the “S00_AXI” port of the “axi_interconnect_2”.
Connect the “M_AXI_GP0” port of the Zynq PS to the “S00_AXI” port of the “axi_interconnect_1”.
Connect the “M_AXI” port of the AXI-PCIe block to the “S00_AXI” port of the “axi_interconnect_0”.
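
With this many interconnects it is easy to lose track of who can reach what. As a sanity check, the connections made in this section can be written down as a small graph and walked in Python. This is only a sketch of the wiring described above (the dictionary and function names are my own), and it says nothing about the address map, which is assigned later. Note that the Zynq PS reaches the DDR3 memory internally, so that path does not appear here.

```python
# Each key drives the AXI interfaces listed as its value.
CONNECTIONS = {
    "zynq_ps.M_AXI_GP0": ["axi_interconnect_1"],
    "axi_pcie.M_AXI": ["axi_interconnect_0"],
    "cdma.M_AXI": ["axi_interconnect_2"],
    "axi_interconnect_0": ["zynq_ps.S_AXI_HP0"],
    "axi_interconnect_1": [
        "axi_pcie.S_AXI",
        "axi_pcie.S_AXI_CTL",
        "cdma.S_AXI_LITE",
    ],
    "axi_interconnect_2": ["axi_interconnect_0", "axi_interconnect_1"],
}


def reachable_slaves(master):
    """Return the set of slave interfaces a master can reach through the wiring."""
    seen = set()
    slaves = set()
    stack = [master]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        targets = CONNECTIONS.get(node)
        if targets is None:
            slaves.add(node)  # nothing downstream, so this is a slave interface
        else:
            stack.extend(targets)
    return slaves


for master in ("zynq_ps.M_AXI_GP0", "axi_pcie.M_AXI", "cdma.M_AXI"):
    print(master, "->", sorted(reachable_slaves(master)))
```

The result should agree with the design overview: the PCIe end-point (via M_AXI of the bridge) only reaches S_AXI_HP0, which is the DDR3 memory, while the CDMA reaches both the DDR3 memory and the PCIe address space.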

Connect all the clocks

Let’s start by hooking up the main clock axi_aclk_out:

Connect “axi_aclk_out” clock to the “M_AXI_GP0_ACLK” and “S_AXI_HP0_ACLK” inputs of the Zynq PS.
Connect “axi_aclk_out” clock to the “m_axi_aclk” and “s_axi_lite_aclk” inputs of the CDMA.
Connect “axi_aclk_out” clock to the “ACLK”, “S00_ACLK”, “M00_ACLK” and “S01_ACLK” inputs of the “axi_interconnect_0” (i.e. all of the clock inputs).
Connect “axi_aclk_out” clock to the “ACLK”, “S00_ACLK”, “M00_ACLK”, “S01_ACLK” and “M02_ACLK” inputs of the “axi_interconnect_1” (notice that we do not connect “M01_ACLK” yet!).
Connect “axi_aclk_out” clock to the “ACLK”, “S00_ACLK”, “M00_ACLK” and “M01_ACLK” inputs of the “axi_interconnect_2” (i.e. all of the clock inputs).

Now the control clock axi_ctl_aclk_out:

Connect “axi_ctl_aclk_out” clock to the “M01_ACLK” input of the “axi_interconnect_1”.

Connect all the resets

Connect the “interconnect_aresetn” output of the “proc_sys_reset_1” Processor System Reset to the “ARESETN” input of ALL 3 AXI Interconnects.
Connect the “peripheral_aresetn” output of the “proc_sys_reset_1” Processor System Reset to the following inputs:
1. CDMA input “s_axi_lite_aresetn”
2. “axi_interconnect_0” inputs “S00_ARESETN”, “M00_ARESETN” and “S01_ARESETN”
3. “axi_interconnect_1” inputs “S00_ARESETN”, “M00_ARESETN”, “S01_ARESETN” and “M02_ARESETN” (notice that we do not connect “M01_ARESETN” yet!)
4. “axi_interconnect_2” inputs “S00_ARESETN”, “M00_ARESETN” and “M01_ARESETN”
Connect the “peripheral_aresetn” output of the “proc_sys_reset_0” Processor System Reset to the “M01_ARESETN” of “axi_interconnect_1”.
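
The rule behind all of these connections is that each interconnect port must be reset by the Processor System Reset that is synchronous to that port's clock. The Python sketch below writes the clock and reset plan from the two sections above into tables and checks that rule. It is only a paper check of the plan (the table and function names are my own) and does not read anything from Vivado.

```python
AXI = "axi_aclk_out"
CTL = "axi_ctl_aclk_out"

# Clock that drives the slowest_sync_clk input of each reset block.
RESET_BLOCK_CLOCK = {"proc_sys_reset_0": CTL, "proc_sys_reset_1": AXI}

# Clock connected to each interconnect port.
PORT_CLOCK = {
    "axi_interconnect_0": {"S00": AXI, "S01": AXI, "M00": AXI},
    "axi_interconnect_1": {"S00": AXI, "S01": AXI, "M00": AXI, "M01": CTL, "M02": AXI},
    "axi_interconnect_2": {"S00": AXI, "M00": AXI, "M01": AXI},
}

# Reset block whose peripheral_aresetn is connected to each port.
R0, R1 = "proc_sys_reset_0", "proc_sys_reset_1"
PORT_RESET = {
    "axi_interconnect_0": {"S00": R1, "S01": R1, "M00": R1},
    "axi_interconnect_1": {"S00": R1, "S01": R1, "M00": R1, "M01": R0, "M02": R1},
    "axi_interconnect_2": {"S00": R1, "M00": R1, "M01": R1},
}


def mismatched_ports(port_clock, port_reset):
    """List the ports whose reset comes from a block on a different clock."""
    bad = []
    for interconnect, ports in port_clock.items():
        for port, clock in ports.items():
            reset_block = port_reset[interconnect][port]
            if RESET_BLOCK_CLOCK[reset_block] != clock:
                bad.append(f"{interconnect}.{port}")
    return bad


print(mismatched_ports(PORT_CLOCK, PORT_RESET))
```

An empty list means every port's reset matches its clock domain. The one port to watch is “M01” of “axi_interconnect_1”, which is the only one on the control clock.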

Assign addresses

Open the “Address Editor” tab and click the “Auto Assign Address” button.
There will be an error generated because Vivado will try to assign 1G to the PCIe BAR0 and 256M to the PCIe control interface (CTL0). Change the size of PCIe BAR0 to 256M and use the “Auto Assign Address” button again. It should succeed this time and you will have the addresses shown below.
Finally, we’ll need to set the size of the PCIe control interface to 64M, to avoid a memory allocation problem in PetaLinux later.
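
A likely explanation for the error (this is my assumption, the tool does not spell it out) is that everything mapped behind M_AXI_GP0 has to fit in that port's 1 GB window, which on the Zynq-7000 spans 0x40000000 to 0x7FFFFFFF. The Python sketch below checks whether a set of segment sizes fits in such a window when each segment is aligned to its own size. The 64K size used for the CDMA control interface is an assumed value for illustration, and the function names are my own.

```python
UNITS = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def size_bytes(text):
    """Convert a size such as '64M' or '1G' to bytes."""
    return int(text[:-1]) * UNITS[text[-1]]


def fits_in_window(window, sizes):
    """Pack segments largest first, each aligned to its own size."""
    offset = 0
    for size in sorted((size_bytes(s) for s in sizes), reverse=True):
        offset = -(-offset // size) * size  # round up to the segment's alignment
        offset += size
    return offset <= size_bytes(window)


# Segment order: PCIe BAR0, PCIe control interface, CDMA control (assumed 64K).
print(fits_in_window("1G", ["1G", "256M", "64K"]))    # first auto-assign attempt
print(fits_in_window("1G", ["256M", "256M", "64K"]))  # after shrinking BAR0
print(fits_in_window("1G", ["256M", "64M", "64K"]))   # final sizes
```

Only the first case fails, which matches what we see in the Address Editor.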

Create the HDL wrapper

Now the block diagram is complete, so we can save it and create an HDL wrapper for it.

Open the “Sources” tab from the Block Design window.
Right-click on “design_1” and select “Create HDL wrapper” from the drop-down menu.
From the “Create HDL wrapper” window, select “Let Vivado manage wrapper and auto-update”. Click “OK”.

Add the constraints

We must now add our constraints to the design for assignment of the PCIe integrated block, the gigabit transceivers, the reference clocks and a few other signals.

Download the constraints file from this link: Constraints for Zynq PCIe Root Complex design
Save the constraints file somewhere on your hard disk.
From the Project Manager, click “Add Sources”.
Then click “Add or create constraints”.
Then click “Add files” and browse to the constraints file that you downloaded earlier. Select the constraints file, then click “OK”. Now tick “Copy constraints files into project” and click “Finish”.
You should now see the constraints file in the Sources window.

You’re all done!

Testing the project on hardware

In the next and final part of this tutorial series, we will test our design on hardware by connecting an NVMe PCIe SSD to our FPGA. We’ll start by running a simple stand-alone application that will check the PCIe bus status and enumerate the end-points. Then we’ll generate a PetaLinux build that is customized to our hardware and we’ll bring up the SSD from the command line.

Sources Git repository

The sources for re-generating this project automatically can be found on Github here: FPGA Drive PCIe Root Complex design

Other useful resources

Here are some other useful resources for creating PCI Express designs:

If you have any questions about this tutorial, or if you run into problems, please leave me a comment below.

Connecting an SSD to an FPGA running PetaLinux

April 15, 2016, 10:43 am

This is the final part of a three part tutorial series on creating a PCI Express Root Complex design in Vivado and connecting a PCIe NVMe solid-state drive to an FPGA.

Part 1: Microblaze PCI Express Root Complex design in Vivado

Part 2: Zynq PCI Express Root Complex design in Vivado

Part 3: Connecting an SSD to an FPGA running PetaLinux (this tutorial)

In this final part of the tutorial series, we’ll start by testing our hardware with a stand-alone application that will verify the status of the PCIe link and perform enumeration of the PCIe end-points. We’ll then run PetaLinux on the FPGA and prepare our SSD for use under the operating system. PetaLinux will be built for our custom hardware using the PetaLinux SDK and the Vivado generated hardware description. Using Linux commands, we will then create a partition, a file system and a file on the solid-state drive.

This part of the tutorial applies to both the Microblaze and Zynq designs developed in the previous tutorials. Where the instructions differ between the designs, they are split into two branches.

Requirements

To complete this tutorial you will need the following:

Vivado 2015.4
PetaLinux SDK 2015.4
Putty (or similar terminal program)
For the Microblaze design:
- KC705 Evaluation Board
For the Zynq design:
- PicoZed 7Z030
- PicoZed FMC Carrier Card V2
- A JTAG programmer such as Digilent HS3 JTAG
FPGA Drive adapter and supplied power splitter
An NVMe PCIe solid-state drive such as this one

Note: The tutorial text and screenshots are suitable for Vivado 2015.4; however, the sources in the Git repository will be regularly updated to the latest version of Vivado.

Tool Setup for Windows users

PetaLinux SDK 2015.4 only runs in the Linux operating system, so Windows users (like me) have to have two machines to follow this tutorial. You can either have two physical machines, which is how I work, or you can have one Windows machine and one Linux virtual machine. In this tutorial, I will assume that you have two physical machines, one running Windows and the other running Linux. My personal setup uses Windows 7 and Ubuntu 14.04 LTS on two separate machines.

If you are building your Linux setup for the first time, here are the supported OSes according to the PetaLinux SDK Installation guide:

RHEL 5 (32-bit or 64-bit)
RHEL 6 (32-bit or 64-bit)
SUSE Enterprise 11 (32-bit or 64-bit)

Note: I had problems installing PetaLinux SDK 2015.4 on 32-bit Ubuntu, as did others, so I use 64-bit Ubuntu and I haven’t had any problems with my setup.

Setup the hardware: KC705

The KC705 Evaluation Board must be set up as shown in the image below. It is strongly recommended that you make the connections in the precise order described below.

[Figure: KC705 hardware setup]

Connect the M.2 PCIe SSD to the FPGA Drive adapter, and tighten the fixing screw
Connect the FPGA Drive to the KC705 PCI Express edge-connector. Do NOT put pressure on the M.2 SSD while doing this.
Connect the input of the power splitter (comes with FPGA Drive) to the power adapter that was supplied with the KC705
Connect one branch of the power splitter to the KC705 power connector (J49).
Connect the other branch of the power splitter to the FPGA Drive power connector
Connect a USB cable between your PC and the UART port of the KC705
Connect a USB cable between your PC and the JTAG port of the KC705
Set DIP switches (SW13) to 11101 (this is for configuration by JTAG, see UG810 page 73)

Setup the hardware: PicoZed and PicoZed FMC Carrier Card V2

The PicoZed 7Z030 and PicoZed FMC Carrier Card V2 must be set up as shown in the image below. It is strongly recommended that you make the connections in the precise order described below.

[Figure: PicoZed 7Z030 and PicoZed FMC Carrier Card V2 hardware setup]

Insert the PicoZed into the SoM socket of the PicoZed FMC Carrier Card V2
Connect the M.2 PCIe SSD to the FPGA Drive adapter, and tighten the fixing screw
Connect the FPGA Drive to the PicoZed FMC Carrier Card V2 PCI Express edge-connector. Do NOT put pressure on the M.2 SSD while doing this.
Connect the input of the power splitter (comes with FPGA Drive) to the power adapter that was supplied with the PicoZed FMC Carrier Card V2
Connect one branch of the power splitter to the PicoZed FMC Carrier Card V2 power connector (J2).
Connect the other branch of the power splitter to the FPGA Drive power connector
Connect a USB cable between your PC and the UART port of the PicoZed FMC Carrier Card V2
Connect a JTAG programmer between your PC and the JTAG port (J7) of the PicoZed FMC Carrier Card V2
Set DIP switches (SW1) to 00 (this is for configuration by JTAG, see PicoZed 7015/7030 User guide table 13)

Regenerate the Vivado project

If you did not follow either of the previous tutorials, and you do not have a completed Vivado project, then follow these instructions to regenerate the Vivado project from scripts. Please note that the Git repository is regularly updated for the latest version of Vivado, so you must download the last “commit” for the version of Vivado that you are using.

Download the sources from Github here: https://github.com/fpgadeveloper/fpga-drive-aximm-pcie
Depending on your operating system:
- If you are using a Windows machine, open Windows Explorer and browse to the “Vivado” folder within the sources you just downloaded. Double-click on the build-<your target platform>.bat file to run the batch file.
- If you are using a Linux machine, run Vivado and then select Window->Tcl Console from the welcome screen. In the Tcl console, use the “cd” command to navigate to the “Vivado” folder within the sources you just downloaded. Then type source build-<your target platform>.tcl to run the build script.
Once the script has finished running, the Vivado project should be regenerated and located in the “Vivado” folder. Run Vivado and open the newly generated project.

Note: You must replace <your target platform> with kc705 or pz-7z030, depending on the target hardware you are using.

Generate a bitstream

The first thing we’ll need to do is to generate a bitstream from the Vivado project we created in the earlier tutorials.

Open the project in Vivado.
From the Flow Navigator, click “Generate Bitstream”.
Depending on your machine, it will take several minutes to perform synthesis and implementation. In the end, you should see the following message. Just select “View Reports” and click OK.
Now we need to use the “Export to SDK” feature to create a hardware description file (.hdf) for the project. From the menu, select File->Export->Export Hardware.
In the Export Hardware window, tick “Include bitstream” and choose “Local to Project” as the export location.

Launch Xilinx SDK

At this point it’s best to launch the Xilinx SDK from Vivado, because it will automatically set up our SDK workspace with a hardware platform based on our project’s hardware description file.

From the menu, select File->Launch SDK.
Specify both the exported location and workspace as “Local to Project”.
The SDK should automatically create the hardware platform (design_1_wrapper_hw_platform_0) for you, and you should see it in the Project Explorer as seen in the image below – first image is for the KC705, and the second image is for the PicoZed.
Now we want to create a template software application, so that we can simply insert code and run it. From the menu, select File->New->Application Project.
In the New Project window, type “pcie_test” as the Project name and click Next. The right Processor for your hardware should already be selected. The image below shows ps7_cortexa9_0 for the PicoZed, but for the KC705, it will be microblaze_0.
Select the “Hello World” template and click Finish.
You should now see the software application “pcie_test” and the BSP “pcie_test_bsp” added to your workspace in the Project Explorer.
Now we need to get the code to test our PCIe link. We will use an example from Xilinx which you can find in the Xilinx SDK installation folders at this location: C:\Xilinx\SDK\2015.4\data\embeddedsw\XilinxProcessorIPLib\drivers\axipcie_v3_0\examples\xaxipcie_rc_enumerate_example.c
In Windows Explorer, browse to the above “examples” folder, right click on the source file and select “Copy”.
Now return to Xilinx SDK and open the “pcie_test” tree to reveal the “src” folder. Now right click on the “src” folder and select “Paste”. This will copy the source file into our application.
Now select the “helloworld.c” file in the “src” folder and press Del to delete the file.
SDK will automatically rebuild the software application.

Now we are ready to run the stand-alone application on the hardware.

Run the stand-alone application

Power up the hardware:

First switch ON the SSD – Then switch ON your FPGA platform
Open Putty or a similar terminal program to receive the console output from the UART.
Check your device manager to find the USB-UART and its COM port. The example below shows COM16. If you don’t find one, then ensure that you have a USB cable between the PC and the UART port of your FPGA board.
In Putty, open a new session using the COM port that you just located and the following settings:
- Baud rate: 9600bps
- Data: 8 bits
- Parity: None
- Stop bits: 1
Now returning to the SDK, from the menu, select Xilinx Tools->Program FPGA.
In the Program FPGA window, we select the hardware platform to program. We have only one hardware platform, so click “Program”. The image below, taken for the KC705 design, shows that the Microblaze will be loaded with the bootloop program. If you are using the PicoZed, you will not be loading the processor with anything and so this line will be blank.
The bitstream will be loaded onto the FPGA and we are ready to load the software application. Select the “pcie_test” folder in the Project Explorer, then from the menu, select Run->Run.
In the Run As window, select “Launch on Hardware (GDB)” and click “OK”.
The application will be loaded on the processor and it will be executed. The terminal window should display this output:

The console output shows that the PCIe “Link is up” and it has enumerated the PCIe bridge and an end-point with Vendor ID 0x144D.

Build PetaLinux

Now that we have validated our hardware, let’s get started using the PetaLinux SDK on our Linux machine.

On your Linux machine, start a command terminal.
Type source /<your-petalinux-install-dir>/settings.sh into the terminal and press Enter. Obviously you must insert the location of your PetaLinux installation.
For consistency, let’s work from a directory called projects/fpga-drive-aximm-pcie in your home directory. Create that directory and then “cd” to it.
Use a USB stick or another method to copy the entire Vivado project directory (should be kc705_aximm_pcie for the KC705, pz_7z030_aximm_pcie for the PicoZed) from your Windows machine onto your Linux machine. Place it into the directory we just created.
Create a PetaLinux project using this command:
- For KC705: petalinux-create --type project --template microblaze --name petalinux_prj
- For PicoZed: petalinux-create --type project --template zynq --name petalinux_prj
Change to the “petalinux_prj” directory in the command terminal.

Stay in the PetaLinux project folder from here on. It is important that all the following commands are run from the PetaLinux project folder that we just created.
Import the Vivado generated hardware description into our PetaLinux project with the command:
- For KC705: petalinux-config --get-hw-description ../kc705_aximm_pcie/kc705_aximm_pcie.sdk/
- For PicoZed: petalinux-config --get-hw-description ../pz_7z030_aximm_pcie/pz_7z030_aximm_pcie.sdk/
The Linux System Configuration will open, but we don’t have any changes to make here, so simply exit and save the configuration.
Configure the Linux kernel with the command: petalinux-config -c kernel
Now we use the kernel configuration menu to enable PCI support and enable the driver for NVM Express devices:
- For KC705:
  - Enable: Bus options->PCI support
  - Enable: Bus options->PCI support->Message Signaled Interrupts (MSI and MSI-X)
  - Enable: Bus options->PCI support->Enable PCI resource re-allocation detection
  - Enable: Bus options->PCI support->PCI host controller drivers->Xilinx AXI PCIe host bridge support
  - Enable: Device Drivers->Block devices->NVM Express block device
- For PicoZed:
  - Check: Bus options->PCI support should already be enabled by default
  - Check: Bus options->PCI support->Message Signaled Interrupts (MSI and MSI-X) should already be enabled by default
  - Check: Bus options->PCI support->Enable PCI resource re-allocation detection should already be enabled by default
  - Check: Bus options->PCI support->PCI host controller drivers->Xilinx AXI PCIe host bridge support should already be enabled by default
  - Enable: Device Drivers->Block devices->NVM Express block device
To configure the Linux root file system, run the command: petalinux-config -c rootfs
Configure the root file system to include some utilities we will need to setup the NVMe PCIe SSD:
- Enable PCI utils (for lspci): Filesystem Packages->console/utils->pciutils->pciutils
- Enable required packages for lsblk, fdisk, mkfs, blkid:
  - Filesystem Packages->base->util-linux->util-linux
  - Filesystem Packages->base->util-linux->util-linux-blkid
  - Filesystem Packages->base->util-linux->util-linux-fdisk
  - Filesystem Packages->base->util-linux->util-linux-mkfs
  - Filesystem Packages->base->util-linux->util-linux-mount
  - Filesystem Packages->base->e2fsprogs->e2fsprogs
  - Filesystem Packages->base->e2fsprogs->e2fsprogs-mke2fs
Build PetaLinux using the command: petalinux-build

PetaLinux will take a few minutes to build depending on your machine.

Boot PetaLinux over JTAG

There are many ways to boot PetaLinux on the hardware, but to avoid going through the details of setting up a flash or SD card boot, we will use the JTAG method for this tutorial.

Power up your hardware:

First switch ON the SSD – Then switch ON your FPGA platform
Open a new session in Putty again, but this time, use a baud rate of 115200bps:
- Baud rate: 115200bps
- Data: 8 bits
- Parity: None
- Stop bits: 1
Boot PetaLinux using these commands:
- For KC705, we load the bitstream then the kernel:
  - petalinux-boot --jtag --fpga --bitstream ../kc705_aximm_pcie/kc705_aximm_pcie.runs/impl_1/design_1_wrapper.bit
  - petalinux-boot --jtag --kernel
- For PicoZed, we package everything, then load it all:
  - petalinux-package --boot --fsbl ./images/linux/zynq_fsbl.elf --fpga ../pz_7z030_aximm_pcie/pz_7z030_aximm_pcie.runs/impl_1/design_1_wrapper.bit --uboot --force
  - petalinux-package --prebuilt --fpga ../pz_7z030_aximm_pcie/pz_7z030_aximm_pcie.runs/impl_1/design_1_wrapper.bit
  - petalinux-boot --jtag --prebuilt 3 --fpga --bitstream ../pz_7z030_aximm_pcie/pz_7z030_aximm_pcie.runs/impl_1/design_1_wrapper.bit
It will take several minutes before the kernel has been transferred via JTAG. Wait for the command line to return, then it can take another 10-20 seconds before you see any output on the Putty terminal.
PetaLinux will boot and you should see the boot log on the Putty terminal window.

If you want to see the complete boot logs, here they are:

How to setup the NVMe SSD in PetaLinux

Log into PetaLinux using the username root and the password root.
Check that the SSD has been enumerated using: lspci. Without any arguments, you get the output as shown in the image below. By using the -vv argument, you get a more detailed output which tells you the link speed and how many lanes are being used, among other things. See the more detailed output here: lspci -vv for KC705 and lspci -vv for PicoZed.
Check that the SSD has been recognized as a block device using: lsblk.
Create a partition on the SSD using: fdisk /dev/nvme0n1.
- Type “n” to create a new partition
- Then type “p”, then “1” to create a new primary partition
- Use the defaults for the sector numbers
- Then type “w” to write the data to the disk
Run lsblk again to get the name of the partition created. As you see in the image below, it is nvme0n1p1.
Create a file system on the new partition using: mkfs -t ext2 /dev/nvme0n1p1. This will take a few minutes.
Make a directory to mount the file system to using: mkdir /media/nvme.
Mount the SSD to that directory: mount /dev/nvme0n1p1 /media/nvme.
Change to the /media/nvme directory.
Create a file called test.txt using vi test.txt.
In VI, press the capital letter “I” (as in India) to start adding text to the file.
Now type The Matrix has you... into the file, press Esc and then type “:x” (colon, then the letter x) to save the file and quit.
Now use ls to see that the file is there.

Reboot

Let’s shut it all down and re-boot so that we can check that our file is still there after powering down.

Use poweroff to shut down Linux.
Power down the hardware.
Run through the steps to Boot PetaLinux over JTAG, until you have logged in again as root.
Create a directory to mount the SSD to again: mkdir /media/nvme.
Mount the SSD to that directory: mount /dev/nvme0n1p1 /media/nvme.
Change to the /media/nvme directory.
Check that the file is still there using: ls.
Display the file using: cat test.txt.

What now?

Here are some interesting things you can explore which will be topics for future tutorials:

Using hdparm to measure the read/write speeds of the SSD
Creating a PetaLinux root file system on the SSD
Booting PetaLinux from the SSD

Source code Git repository

The sources for re-generating this project automatically can be found on Github here: FPGA Drive PCIe Root Complex design

Other useful resources

Here are some other useful resources for creating PCI Express designs:

If you have any questions about this tutorial, or if you run into problems, please leave me a comment below.

Using AXI DMA in Vivado Reloaded

October 10, 2017, 6:44 pm

The DMA is one of the most critical elements of any FPGA or high speed computing design. It allows data to be transferred from source to memory, and memory to consumer, in the most efficient manner and with minimal intervention from the processor. It’s no wonder then that a tutorial I wrote three years ago about using the AXI DMA IP is still relevant and still getting thousands of visits per month. I decided to remake that tutorial, this time as a video and using Vivado 2017.2 (just today they released Vivado 2017.3, doh!). Although I prefer doing written tutorials, I think that video tutorials can be very useful in their own way, and they’re a hell of a lot easier for me to produce. I hope you find this one useful.

Video transcript:

Hi I’m Jeff. In this video I’m going to show you a simple example of using the AXI DMA in Vivado. This is going to be based on a tutorial that I did in 2014. Now I’m going to refer to this diagram a few times. So to tell you a little bit about DMA, DMA is basically an interface between a data producer or a consumer, and a memory controller, so you’d need a DMA if for example you had data coming in from an ADC and you need to store it very quickly into memory. Or in the other case when you have a DAC and you have data in a memory and you need to send that data as quickly as possible through to your DAC. In both of these cases, you could always use the processor to do this job so the DMA is not the only solution but obviously using the processor to transfer data from one place to another is very time consuming for the processor and it’s a bit of a waste of the processor. The processor should be left to do intelligent things. So the DMA is really a hardware solution for transferring data from one place to another and it is the most efficient way of doing so.

So I’ll get into the example now. In Vivado we start by creating a project. Now I’m going to base this one on the MicroZed 7010 so I’ll call the project mz_7010_dma_test. Next. Now it’s an RTL project and I won’t specify sources at this time because I don’t have sources for this. Here we select the board, so I’m going to select MicroZed 7010 and all that’s going to do is configure this project for the right part depending on the hardware we’re using. So I click finish.

Now I can click Create Block Design, I’ll leave it with the default name. And the first thing I do is add my Zynq processing system to the design. I’m going to click run block automation. Now what the block automation’s going to do in the beginning is apply the board preset on the processing system. So that’s going to configure the Zynq PS for the hardware we’re using. So depending on the board preset that we chose earlier, which means the DDR and whatever other hardware devices we have connected to the Zynq, well the board preset should configure that. So I click OK.

Now in the block diagram I can see that the DDR interface has been externalized, so I know that my Zynq is configured with the DDR to which it is connected on the MicroZed. Now I also have a FIXED_IO port, that’s for all the other devices that are wired to the Zynq on the MicroZed, so if I want to know what they are I could just click on the Zynq PS and have a look at the block diagram here. I can see that I have a UART connected, because I know the MicroZed has a USB UART on there, GPIO probably has some LEDs, has an SD card, so an SDIO interface for that, USB port and Ethernet, so that’s one of the Gigabit Ethernet MACs that’s enabled there and connected to the Gigabit Ethernet PHY and RJ45 connector that’s on the MicroZed. So the other thing the board preset is going to do for us is configure a clock for us, so we can see here that we have one of the fabric clocks that has been configured to 100MHz, so I’m going to use that clock for all of this design. OK so now what I want to do in this design because I have an AXI DMA, AXI DMA is going to need access to the DDR memory controller and it’s also going to need a configuration interface which is AXI lite, so the processor is going to need to configure the AXI DMA through an AXI lite port and the AXI DMA is going to need access to the DDR. That’s the important thing to know because I need to configure the Zynq PS for those things. So if I go to my Zynq block design, firstly for the DMA configuration, the Zynq is going to need a general purpose AXI master port so that it can configure the DMA, when I say configure the DMA, I mean set up DMA transfers and trigger them. That’s what the AXI master port is for and that’s what I’m going to enable here, so the easy way to do it is I click on this block here and Vivado takes me through to the right setting that I have to enable, so here I can see general purpose master AXI interface and I click, I tick that to enable one, the GP0 interface. 
So again, that’s my interface that I’m going to use to configure the DMA from the processor. The other interface that I need is to access the DDR controller, the memory controller, and I can see from this diagram what I need to enable, is one of these high performance AXI slave ports. So that would allow the DMA to read and write from the DDR. So to enable one of these, I have to click on that. Tick one of those, there’s four of them, I only need one. So now I’ve got those two ports. The only other thing that I need to configure here, is the interrupts, so I need to enable fabric interrupts, because I’m going to be receiving interrupts from the DMA IP. So I have to enable this IRQ_F2P which means FPGA to Processor system. I enable that. Click OK. So now my Zynq is properly configured. I can start off by connecting the fabric clock, the 100MHz clock through to these AXI interfaces. My two AXI interfaces, the general purpose master AXI interface, and the high performance slave. So here is my high performance slave and here is my general purpose master. So that’s that, now what I can do is add my DMA. There’s a few DMAs there for different applications, the one that I want though is “AXI Direct Memory Access”. So now what I can see is I’ve got an AXI lite interface, that’s for configuration of the DMA, setting up DMA transfers from the processor, and I’ve got these other interfaces here which … M_AXI or just AXI itself is just an AXI memory mapped interface, whereas AXIS is going to be the AXI streaming interfaces. So these AXI memory mapped interfaces are going to need to go through to the DDR controller, or to the high performance port. Whereas the AXI lite is going to need to go to the general purpose AXI master. 
Then we have the streaming interfaces and in our case, what we want to do is we want to connect the streaming interfaces through an AXI Data FIFO so we can loop back the data, so that way our application is going to just send data from the memory through the DMA, to the FIFO, that’s going to be looped back to the DMA and be written back into the memory, so the processor can just verify that the transfer was successful by comparing the data that was sent and the data that was received. So that only leaves two ports here, which are the AXI streaming status and AXI streaming control ports. We don’t actually need those, those are used in Ethernet applications, so we’re going to disable them. So that gets rid of them, so now I can start connecting my interfaces, so I’m going to run connection automation. I’ll tick on first the AXI lite interface, slave interface. It’s always good when you’re using connection automation to check what Vivado wants to connect things to, but here I can see it wants to connect the AXI lite interface through to the processing system’s general purpose AXI master port – that’s correct, that’s what we want. Now it can’t really make a mistake here because there is only one master AXI interface configured on the Zynq at the moment, the other is a slave interface, the high performance port is a slave interface, so it’s not going to use that. So that’s right, now I can tick on the high performance slave AXI interface of the PS. Vivado wants to connect it to the scatter gather, AXI master interface of the DMA. Now it could’ve chosen any of the other AXI master interfaces, it chose scatter gather, it doesn’t really matter because it’s just going to create either an AXI interconnect, or an AXI smartconnect for this and then the other two ports will be able to go through that as well. So anyway we start things off like that.

OK now if I just run through quickly what that’s done. We just want to make sure that our general purpose AXI master port, that’s going through now to an AXI interconnect, which is called peripheral, so that’s for all of your peripherals. It goes through here. Out here. And it should go through to the AXI lite interface of the DMA, so that’s for configuration of the DMA, and for triggering and setting up DMA transfers. So what about the master interfaces of the AXI DMA. We’ve only connected one of them for now, the scatter gather interface, so that goes through here, through to an AXI smartconnect, and then this should go through to our high performance slave interface which is basically access to the DDR, the memory. That’s what that has done. I’m going to run connection automation again to hook up my last two AXI master interfaces of the DMA, that’s going to connect both of them through to the high performance slave port, so click OK. And here they are. And all that’s done is open up two more ports on this AXI smartconnect. OK so now I’ve hooked up those things, that leaves my AXI streaming interfaces to connect. So if I go back to the diagram, I’m talking about these two interfaces, so I need to add my AXI data FIFO and connect up my AXI streaming interfaces. So I go plus, FIFO, I want an AXI4 Stream data FIFO. OK and I want to connect the AXI streaming master interface of the FIFO through to the AXI streaming slave interface of the DMA. And I want to connect the AXI streaming master interface of the DMA through to the AXI streaming slave interface of the FIFO. So that’s going to be my loopback, so the data’s going to come out of here, memory mapped to streaming, it’s going to go through the FIFO and it’s going to come out of the FIFO and back into the DMA, the streaming to memory mapped interface and be written again to the DDR memory. 
So what about these things here the FIFO needs to be clocked, we’re going to use the same 100MHz clock that everything else uses. So I hook that through to there. And for the reset, I want to use the reset that the rest of my design is using which is generated by the automatically generated processor system reset. The source of which is the fabric clock reset here. So that’s what I’m going to use. So now that’s all connected properly the only thing that I have to connect up now are the two interrupts of the DMA. I need to connect them through to here, the IRQ_F2P. The way to do that is to use a Concat. So the output of my concat has to go to there. And then my two interrupts have to go to there. And my interrupts are now connected. So that’s my design and I can save the block design now. I can click validate design, to make sure that I haven’t made any mistakes or forgotten to connect any clocks or resets. OK so this is an intermittent problem that sometimes happens with MicroZed designs, it’s something that started I think a couple of versions ago, but you can safely ignore these messages, they’re basically coming from the board preset and Vivado’s complaining about them now whereas it didn’t complain about them at all in previous versions. Anyway so we’ll just ignore those. And save the design again.

For more info regarding this issue, check out this forum post: https://forums.xilinx.com/t5/Design-Entry/Vivado-critical-warning-when-creating-hardware-wrapper/td-p/762938

So now the only thing I have to do is to generate my HDL wrapper. So I click on that and I say let Vivado manage wrapper and auto update. Now the only thing I have to do is generate the bitstream.

OK so my bitstream has been generated. I’m going to tick on view reports because I don’t want to open the implemented design. Now what I have to do is I want to bring this hardware design into the SDK so I can run a software application on it and test out the hardware. So to do that I have to say File-Export-Export Hardware and “include the bitstream”. I’ll export it local to the project. I click OK. So now the hardware’s been exported for SDK, I just have to run SDK, so the easy way to do that is go File-Launch SDK. And I exported it local to the project, my SDK workspace I’ll also leave it local to the project, it doesn’t really change much for me here. So OK.

So the SDK workspace at the moment has nothing in it, it should have nothing in it except my hardware platform specification. Which is here. It’s got the name of the block design. So I’ve got to add my application to the SDK at this point. So the way I do that is I say File-New-Application project. Now I can call this dma_test like that. If I just look into what’s going on here, here you can choose the processor that the application is going to run on. Now because the Zynq on the MicroZed has a dual-core ARM processor, we can choose which one we want to use, I’ll just choose that one. You obviously have to specify the hardware platform, or the hardware platform which is defined here. So there’s only one that I’ve got to choose from so that’s why it’s choosing that. Then I click next. What I’m going to do is I’m going to use an empty application for this, that’s going to be an application with no code, I’ve got to supply the code which I’ll do. So I say finish. So in my dma_test application here I’ve got no sources just the linker script and a readme. So to do this, I’m going to add my software application. What I’m going to use as a software application is an example software application that is provided by Xilinx. It is in this folder here Xilinx SDK version number, data, embeddedsw, Xilinx processor IP lib drivers, AXI DMA, examples (actual folder C:\Xilinx\SDK\2017.2\data\embeddedsw\XilinxProcessorIPLib\drivers\axidma_v9_3\examples). So this is in your Xilinx installation files. So what I want to do is use the example scatter gather poll, to begin with, let’s try that. So if I take this file, maybe I can drag it over here. Copy files. Click OK. So that’s copied that into my software application and now.. project build automatically, so I’m using the build automatically setting, so it should have built that project automatically. So let’s try and run that application.

So first of all, you’ve got to make sure that you’ve set the jumpers correctly for the configuration of the Zynq. So here I’ve got the jumpers set for configuration by JTAG. Here’s my JTAG programmer that I’m going to use, I’ll just plug it in. And now I can plug in my USB cable. Plug into here. And when I do that, on the MicroZed you’ll notice that the LEDs turned ON, that’s because it’s getting its power from the USB port. So now what I can do is go back into the SDK, click Xilinx Tools, Program FPGA. Now that’s loaded the bitstream into the FPGA on the Zynq. Now all I have to do is run the application, but before I run the application, I’m going to set up a connection to it. So I’ve already set that up earlier, so here is my COM port terminal window, connected through to the MicroZed, so when I run my application, I should see some text coming up onto my console window. So to run the test, I’m going to click on dma_test application, click on run configurations, and I want to use the System Debugger for this, so I double click on that. And I can then click Run. And it will run the application on the hardware, I can see here that the application was successful.

Now just one last thing, I’ll create another application. I want to run another application but this time using the example application that has interrupts. I’ll again choose an empty application, and copy the application code into the workspace. OK so now I can see in the application I have no code. So I want to grab this, the scatter gather example with interrupts. I’ll drag it over into my application. OK now it’s going to crash (meant to say: fail to compile!). If you go and look and see why it crashed (meant to say: failed!), you’ll find that there are a couple of defines here that Vivado (the SDK) can’t find. So if I hover over that it says to me that this define is undeclared “first use in this function”. So this is a define that should be in xparameters.h in the BSP. So I’m going to open up the BSP and see why that define isn’t there, and maybe change the, maybe it’s changed names. So I’ll go into xparameters.h, and see.. let’s search for this name maybe I can find it.. OK so here I can see that the defines that this application is looking for have changed names in this version of Vivado (SDK), so all I want to do is take the new names and modify my code with the new names, so this is the MM2S. I’ll change that. and then get the other one, S2MM. And change that one. Then save the file. It builds automatically and I can see that now my, now the SDK can, knows, the interrupt vector IDs. So the application is built, I just have to run it, so I’m again going to say I’m going to click on the application. Click on Run configurations. Double click on system debugger. And then run. OK and when I run that, go back to my terminal window, to see the output. And I see that it says “successfully ran the AXI DMA scatter gather interrupt example” so, that’s my two examples working. At this point I guess I leave you guys to muck around with the example applications, see what you can learn from the code. 
So thanks for watching and good luck with your projects.

Getting Started with the MYIR Z-turn

October 17, 2017, 6:03 pm

In this video I create a simple Vivado design for the MYIR Z-turn Zynq SoM and we run a hello world application on it, followed by the lwIP echo server. We connect the Z-turn to a network, then we use “ping” and “telnet” to test the echo server from a PC that is connected to the same network.

If you want to try it out yourself, download the SD card boot files here:

The SoM

The Z-turn stands out in the market of Zynq based SoMs because it’s got a few features that the others don’t; of most interest to me being the accelerometer and the HDMI interface. My first impressions of the board were good, it has a clean look, it’s compact and it has most features I’d normally be looking for. But there was one thing I didn’t like: the JTAG header. They’ve chosen the big 100mil pitch header for the old Platform Cable USB. Most Xilinx dev boards nowadays have a smaller JTAG header, so none of my JTAG programmers can actually plug into this. Anyway, if I do anything serious with this board I’ll definitely have to wire up an adapter.

Z-turn JTAG

Board files

Another little issue I found was with the support. I couldn’t find the board preset files anywhere on the MYIR website, nor on the CD that comes with the board. So in the end I found Sergiusz Bazanski’s Github repo and his own hand-coded board files for the Z-turn:

https://github.com/q3k/zturn-stuff

You’ll need to install those board files before going through the example. Thanks Sergiusz!

USB-UART trap

Something I emphasized in the video and want to reiterate here: the USB-UART on the Z-turn is connected to the PS UART1 (ONE) peripheral. That’s important to know because the PS UART0 (ZERO) peripheral is also enabled by Sergiusz’ board preset, and it’s this peripheral that the SDK will choose by default for STDIO. This means that when you create a BSP in the SDK, it will select the PS UART0 for your STDIO, not your USB-UART. So you have to manually change it, or you can expect nothing to come up on your UART console window.

Ethernet PHY issue

When trying to get the lwIP echo server running, be aware that the Z-turn has an AR8035 Atheros Ethernet PHY. The lwIP driver doesn’t contain code for properly configuring that PHY, instead it’s designed for TI and Marvell PHYs. In this video, I show you how to modify the lwIP driver so that it does properly configure the PHY. Here is the code snippet for that:

	// Enable RGMII TX clock delay in the AR8035 PHY
	XEmacPs_PhyWrite(xemacpsp, phy_addr, 0x1D, 0x05);
	XEmacPs_PhyWrite(xemacpsp, phy_addr, 0x1E, 0x0100);
	// Enable RGMII RX clock delay in the AR8035 PHY
	XEmacPs_PhyWrite(xemacpsp, phy_addr, 0x1D, 0x0);
	XEmacPs_PhyWrite(xemacpsp, phy_addr, 0x1E, 0x8000);

Here is the name of the file that needs to be modified:

\echo_server_bsp\ps7_cortexa9_0\libsrc\lwip141_v1_9\src\contrib\ports\xilinx\netif\xemacpsif_physpeed.c

As you can see from the code, the main issue is the configuration of the RGMII TX and RX clock delays. The Zynq GEM expects both of those delays to be enabled in the PHY. The lwIP code actually tries to enable those delays, but it’s writing to the wrong registers because it’s expecting a Marvell PHY, not an Atheros PHY. If we don’t use the above code then we get bad timing on the RGMII interface and the echo server won’t work.

Great hardware, lacks support

Overall, I like the board but the support you find online is limited. The price is great, so if you particularly need a Zynq SoM with HDMI, then yes I’d recommend this board.

Creating a custom AXI-Streaming IP in Vivado

November 1, 2017, 7:47 am

The AXI-Streaming interface is important for designs that need to process a stream of data, such as samples coming from an ADC, or images coming from a camera. In this tutorial, we go through the steps to create a custom IP in Vivado with both a slave and master AXI-Streaming interface. The custom IP will be written in Verilog and it will simply buffer the incoming data at the slave interface and make it available at the master interface – in other words, it will be a FIFO. We’ll test the custom IP using a DMA which we’ll use to push streaming data into the IP and pull data out of the IP. We’ll use an SDK application to setup these DMA transfers and compare the sent data with the received data. The hardware we use for testing this will be the MicroZed 7010, so this is a Zynq-7000 design.

The above image is a basic block diagram of our Vivado design, it shows how the DMA connects to the Zynq Processing System, and also how the custom IP connects to the AXI-Streaming interfaces of the DMA. If you are not familiar with the DMA IP, you should check out this tutorial on using the DMA.

Source code for the custom IP

The Verilog code for our custom IP is based on an asynchronous AXI-Streaming FIFO written by Alex Forencich. You can find the original code on his Github repo, as well as a bunch of other useful modules. I’ve had to slightly modify the code for this project and you’ll be able to copy and paste it from below:

/*

Copyright (c) 2014-2017 Alex Forencich

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.

*/

/*

Modified by Jeff Johnson http://www.fpgadeveloper.com

- Renamed ports to match Vivado's naming for AXI-Streaming slave and master
- Removed the async reset input to the module
- Added separate resets for slave and master interfaces
- Removed the tuser signals (not used by Vivado)

*/

// Language: Verilog 2001

`timescale 1ns / 1ps

/*
 * AXI4-Stream asynchronous FIFO
 */
module axis_fifo_v1_0 #
(
    parameter ADDR_WIDTH = 12,
    parameter C_AXIS_TDATA_WIDTH = 32
)
(
    /*
     * AXI slave interface (input to the FIFO)
     */
    input  wire                   s00_axis_aclk,
    input  wire                   s00_axis_aresetn,
    input  wire [C_AXIS_TDATA_WIDTH-1:0]  s00_axis_tdata,
    input  wire [(C_AXIS_TDATA_WIDTH/8)-1 : 0] s00_axis_tstrb,
    input  wire                   s00_axis_tvalid,
    output wire                   s00_axis_tready,
    input  wire                   s00_axis_tlast,
    
    /*
     * AXI master interface (output of the FIFO)
     */
    input  wire                   m00_axis_aclk,
    input  wire                   m00_axis_aresetn,
    output wire [C_AXIS_TDATA_WIDTH-1:0]  m00_axis_tdata,
    output wire [(C_AXIS_TDATA_WIDTH/8)-1 : 0] m00_axis_tstrb,
    output wire                   m00_axis_tvalid,
    input  wire                   m00_axis_tready,
    output wire                   m00_axis_tlast
);

reg [ADDR_WIDTH:0] wr_ptr_reg = {ADDR_WIDTH+1{1'b0}}, wr_ptr_next;
reg [ADDR_WIDTH:0] wr_ptr_gray_reg = {ADDR_WIDTH+1{1'b0}}, wr_ptr_gray_next;
reg [ADDR_WIDTH:0] wr_addr_reg = {ADDR_WIDTH+1{1'b0}};
reg [ADDR_WIDTH:0] rd_ptr_reg = {ADDR_WIDTH+1{1'b0}}, rd_ptr_next;
reg [ADDR_WIDTH:0] rd_ptr_gray_reg = {ADDR_WIDTH+1{1'b0}}, rd_ptr_gray_next;
reg [ADDR_WIDTH:0] rd_addr_reg = {ADDR_WIDTH+1{1'b0}};

reg [ADDR_WIDTH:0] wr_ptr_gray_sync1_reg = {ADDR_WIDTH+1{1'b0}};
reg [ADDR_WIDTH:0] wr_ptr_gray_sync2_reg = {ADDR_WIDTH+1{1'b0}};
reg [ADDR_WIDTH:0] rd_ptr_gray_sync1_reg = {ADDR_WIDTH+1{1'b0}};
reg [ADDR_WIDTH:0] rd_ptr_gray_sync2_reg = {ADDR_WIDTH+1{1'b0}};

reg s00_rst_sync1_reg = 1'b1;
reg s00_rst_sync2_reg = 1'b1;
reg s00_rst_sync3_reg = 1'b1;
reg m00_rst_sync1_reg = 1'b1;
reg m00_rst_sync2_reg = 1'b1;
reg m00_rst_sync3_reg = 1'b1;

reg [C_AXIS_TDATA_WIDTH+2-1:0] mem[(2**ADDR_WIDTH)-1:0];
reg [C_AXIS_TDATA_WIDTH+2-1:0] mem_read_data_reg = {C_AXIS_TDATA_WIDTH+2{1'b0}};
reg mem_read_data_valid_reg = 1'b0, mem_read_data_valid_next;
wire [C_AXIS_TDATA_WIDTH+2-1:0] mem_write_data;

reg [C_AXIS_TDATA_WIDTH+2-1:0] m00_data_reg = {C_AXIS_TDATA_WIDTH+2{1'b0}};

reg m00_axis_tvalid_reg = 1'b0, m00_axis_tvalid_next;

// full when first TWO MSBs do NOT match, but rest matches
// (gray code equivalent of first MSB different but rest same)
wire full = ((wr_ptr_gray_reg[ADDR_WIDTH] != rd_ptr_gray_sync2_reg[ADDR_WIDTH]) &&
             (wr_ptr_gray_reg[ADDR_WIDTH-1] != rd_ptr_gray_sync2_reg[ADDR_WIDTH-1]) &&
             (wr_ptr_gray_reg[ADDR_WIDTH-2:0] == rd_ptr_gray_sync2_reg[ADDR_WIDTH-2:0]));
// empty when pointers match exactly
wire empty = rd_ptr_gray_reg == wr_ptr_gray_sync2_reg;

// control signals
reg write;
reg read;
reg store_output;

assign s00_axis_tready = ~full & ~s00_rst_sync3_reg;

assign m00_axis_tvalid = m00_axis_tvalid_reg;

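assign mem_write_data = {s00_axis_tlast, s00_axis_tdata};
assign {m00_axis_tlast, m00_axis_tdata} = m00_data_reg;

// tstrb is not stored in the FIFO, so mark every output byte as a data byte
assign m00_axis_tstrb = {(C_AXIS_TDATA_WIDTH/8){1'b1}};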

// reset synchronization
always @(posedge s00_axis_aclk) begin
    if (!s00_axis_aresetn) begin
        s00_rst_sync1_reg <= 1'b1;
        s00_rst_sync2_reg <= 1'b1;
        s00_rst_sync3_reg <= 1'b1;
    end else begin
        s00_rst_sync1_reg <= 1'b0;
        s00_rst_sync2_reg <= s00_rst_sync1_reg | m00_rst_sync1_reg;
        s00_rst_sync3_reg <= s00_rst_sync2_reg;
    end
end

always @(posedge m00_axis_aclk) begin
    if (!m00_axis_aresetn) begin
        m00_rst_sync1_reg <= 1'b1;
        m00_rst_sync2_reg <= 1'b1;
        m00_rst_sync3_reg <= 1'b1;
    end else begin
        m00_rst_sync1_reg <= 1'b0;
        m00_rst_sync2_reg <= s00_rst_sync1_reg | m00_rst_sync1_reg;
        m00_rst_sync3_reg <= m00_rst_sync2_reg;
    end
end

// Write logic
always @* begin
    write = 1'b0;

    wr_ptr_next = wr_ptr_reg;
    wr_ptr_gray_next = wr_ptr_gray_reg;

    if (s00_axis_tvalid) begin
        // input data valid
        if (~full) begin
            // not full, perform write
            write = 1'b1;
            wr_ptr_next = wr_ptr_reg + 1;
            wr_ptr_gray_next = wr_ptr_next ^ (wr_ptr_next >> 1);
        end
    end
end

always @(posedge s00_axis_aclk) begin
    if (s00_rst_sync3_reg) begin
        wr_ptr_reg <= {ADDR_WIDTH+1{1'b0}};
        wr_ptr_gray_reg <= {ADDR_WIDTH+1{1'b0}};
    end else begin
        wr_ptr_reg <= wr_ptr_next;
        wr_ptr_gray_reg <= wr_ptr_gray_next;
    end

    wr_addr_reg <= wr_ptr_next;

    if (write) begin
        mem[wr_addr_reg[ADDR_WIDTH-1:0]] <= mem_write_data;
    end
end

// pointer synchronization
always @(posedge s00_axis_aclk) begin
    if (s00_rst_sync3_reg) begin
        rd_ptr_gray_sync1_reg <= {ADDR_WIDTH+1{1'b0}};
        rd_ptr_gray_sync2_reg <= {ADDR_WIDTH+1{1'b0}};
    end else begin
        rd_ptr_gray_sync1_reg <= rd_ptr_gray_reg;
        rd_ptr_gray_sync2_reg <= rd_ptr_gray_sync1_reg;
    end
end

always @(posedge m00_axis_aclk) begin
    if (m00_rst_sync3_reg) begin
        wr_ptr_gray_sync1_reg <= {ADDR_WIDTH+1{1'b0}};
        wr_ptr_gray_sync2_reg <= {ADDR_WIDTH+1{1'b0}};
    end else begin
        wr_ptr_gray_sync1_reg <= wr_ptr_gray_reg;
        wr_ptr_gray_sync2_reg <= wr_ptr_gray_sync1_reg;
    end
end

// Read logic
always @* begin
    read = 1'b0;

    rd_ptr_next = rd_ptr_reg;
    rd_ptr_gray_next = rd_ptr_gray_reg;

    mem_read_data_valid_next = mem_read_data_valid_reg;

    if (store_output | ~mem_read_data_valid_reg) begin
        // output data not valid OR currently being transferred
        if (~empty) begin
            // not empty, perform read
            read = 1'b1;
            mem_read_data_valid_next = 1'b1;
            rd_ptr_next = rd_ptr_reg + 1;
            rd_ptr_gray_next = rd_ptr_next ^ (rd_ptr_next >> 1);
        end else begin
            // empty, invalidate
            mem_read_data_valid_next = 1'b0;
        end
    end
end

always @(posedge m00_axis_aclk) begin
    if (m00_rst_sync3_reg) begin
        rd_ptr_reg <= {ADDR_WIDTH+1{1'b0}};
        rd_ptr_gray_reg <= {ADDR_WIDTH+1{1'b0}};
        mem_read_data_valid_reg <= 1'b0;
    end else begin
        rd_ptr_reg <= rd_ptr_next;
        rd_ptr_gray_reg <= rd_ptr_gray_next;
        mem_read_data_valid_reg <= mem_read_data_valid_next;
    end

    rd_addr_reg <= rd_ptr_next;

    if (read) begin
        mem_read_data_reg <= mem[rd_addr_reg[ADDR_WIDTH-1:0]];
    end
end

// Output register
always @* begin
    store_output = 1'b0;

    m00_axis_tvalid_next = m00_axis_tvalid_reg;

    if (m00_axis_tready | ~m00_axis_tvalid) begin
        store_output = 1'b1;
        m00_axis_tvalid_next = mem_read_data_valid_reg;
    end
end

always @(posedge m00_axis_aclk) begin
    if (m00_rst_sync3_reg) begin
        m00_axis_tvalid_reg <= 1'b0;
    end else begin
        m00_axis_tvalid_reg <= m00_axis_tvalid_next;
    end

    if (store_output) begin
        m00_data_reg <= mem_read_data_reg;
    end
end

endmodule

Remember, when you create the custom IP, Vivado will auto-generate a top level wrapper (filename is axis_fifo_v1_0.v) and some code to drive the slave and master AXI-Streaming interfaces. You’ll have to paste the above code over the top module source code (axis_fifo_v1_0.v) of the auto-generated IP. The other two auto-generated source files can be left as they are – they will be removed from the hierarchy as soon as you replace and save the top module code, because they will no longer be instantiated by the top module.
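The full and empty tests in the FIFO code compare Gray-coded pointers, which is easy to get wrong by one bit. Here is a minimal Python sketch that mirrors the two comparisons and checks them against the real fill level for every pointer pair. It assumes a small ADDR_WIDTH of 3 for the demo and ignores the two-stage synchronizer delay between clock domains; the function names are mine, not part of the Verilog.

```python
ADDR_WIDTH = 3                            # small depth for the demo; the Verilog default is 12
PTR_MASK = (1 << (ADDR_WIDTH + 1)) - 1    # pointers carry one extra wrap bit


def to_gray(value):
    """Binary to Gray code, the same expression as wr_ptr_next ^ (wr_ptr_next >> 1)."""
    return value ^ (value >> 1)


def is_empty(wr_ptr, rd_ptr):
    """Empty when the Gray-coded pointers match exactly."""
    return to_gray(rd_ptr) == to_gray(wr_ptr)


def is_full(wr_ptr, rd_ptr):
    """Full when the top two Gray bits differ and all remaining bits match."""
    diff = to_gray(wr_ptr) ^ to_gray(rd_ptr)
    return diff == (0b11 << (ADDR_WIDTH - 1))


def check_flags():
    """Compare both flags with the true fill level for every pointer pair."""
    depth = 1 << ADDR_WIDTH
    for rd_ptr in range(PTR_MASK + 1):
        for fill in range(depth + 1):
            wr_ptr = (rd_ptr + fill) & PTR_MASK
            if is_empty(wr_ptr, rd_ptr) != (fill == 0):
                return False
            if is_full(wr_ptr, rd_ptr) != (fill == depth):
                return False
    return True


if __name__ == "__main__":
    print("flags agree with fill level:", check_flags())
```

In the hardware, each side sees the other side's pointer a couple of clock cycles late, so the flags are conservative rather than exact. The comparison itself is the same.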

MicroZed Board Preset issue

When building our Vivado design, just after generating an HDL wrapper for the block design, you will see some critical warnings related to the timing of the DDR interface. These critical warnings are caused by some values in the board files and can be ignored. See this forum post for more information:

https://forums.xilinx.com/t5/Design-Entry/Vivado-critical-warning-when-creating-hardware-wrapper/td-p/762938

The test application for SDK

We test the custom IP by making the DMA push data into its AXI-Streaming slave interface and pull data out of its AXI-Streaming master interface. The application we will use for this is one of the DMA example applications found in the Xilinx SDK installation files. You will find it on this path:

C:\Xilinx\SDK\2017.3\data\embeddedsw\XilinxProcessorIPLib\drivers\axidma_v9_4\examples

In this tutorial, we use the scatter-gather polling example (xaxidma_example_sg_poll.c), but as we hooked up the interrupts in the Vivado design, we could also have used the interrupt-based one (xaxidma_example_sg_intr.c).

What to try

Once you’ve gotten this working, I suggest you try modifying the test application in the SDK to print out what is actually being sent and received. You could then modify your Verilog code to manipulate the incoming data in some way, rebuild everything and verify with your test application that the data coming out is what you expected. Another useful thing to do when building custom IP blocks like this is to write a test bench and simulate the custom IP. This will be the topic of a future tutorial.

↧

Artix-7 Arty Base Project

November 7, 2017, 6:15 pm

Here’s a base project for the Arty board based on the Artix-7 FPGA. The Arty is a nice little dev board because it’s low cost ($99 USD) but it’s still got enough power and connectivity to make it very useful. I really like the fact that the JTAG and UART are both accessed through the same USB connector, so I only need to connect one USB cable. I also like the fact that I can power it from the USB connector alone, provided I don’t connect too many power-hungry Pmods or an Arduino shield.

In this project, we leverage the Arty’s board files and Vivado’s automation features to quickly put together a base design that exploits most of the hardware on the board. Then in the second video, we shift to the Xilinx SDK and test our design on hardware by running a “hello world” application and then the lwIP echo server application. In future MicroBlaze tutorials we’ll build on this design.

Board files

Before you can run through this tutorial, you’ll need to install the Arty’s board files to your Vivado installation. You can download the board files here, and follow Digilent’s instructions for installing them.

Clocking

The Arty has an on-board oscillator to generate a 100MHz clock. We need to feed this clock into a Clock Wizard to generate three clocks: two for the MIG (DDR) and one for the Ethernet PHY.

166.667MHz: For the MIG’s sys_clk_i input
200MHz: For the MIG’s clk_ref_i input
25MHz: For the Ethernet PHY reference clock

The rest of our design will run off the MIG’s ui_clk output (83.333MHz).
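To see how all three Clock Wizard outputs can come from the single 100MHz input, here is a short Python sketch of the arithmetic. The multiplier and divider values are my own assumption (a 1000MHz VCO divided by 6, 5 and 40); the Clock Wizard chooses its own settings, which may differ while producing the same frequencies. The VCO frequency also has to stay within the range allowed by the device datasheet.

```python
from fractions import Fraction

INPUT_MHZ = Fraction(100)   # Arty on-board oscillator
VCO_MULT = 10               # assumed multiplier, giving a 1000 MHz VCO
OUTPUT_DIVIDERS = {         # assumed output dividers, one per generated clock
    "sys_clk_i": 6,
    "clk_ref_i": 5,
    "eth_ref_clk": 40,
}


def output_clocks(input_mhz=INPUT_MHZ, mult=VCO_MULT, dividers=OUTPUT_DIVIDERS):
    """Return each output frequency in MHz as an exact fraction."""
    vco = input_mhz * mult
    return {name: vco / div for name, div in dividers.items()}


if __name__ == "__main__":
    for name, freq in output_clocks().items():
        print(f"{name}: {float(freq):.3f} MHz")
```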

Ethernet reference clock

On the Arty schematics, you’ll see that the Ethernet PHY has provisions for a 25MHz crystal to generate its own 25MHz reference clock. However, the crystal is not loaded on the board, probably to help get that price down to $99! For this reason, the FPGA needs to generate and feed a clock to the Ethernet PHY, and this is why we generate the 25MHz clock from the Clock Wizard. The FPGA pin that connects to the Ethernet reference clock input on the PHY is G18, and we have to provide a pin location constraint for it in our design. Here are the constraints to add to the design for this purpose:

# Arty Ethernet reference clock
set_property IOSTANDARD LVCMOS33 [get_ports eth_ref_clk]
set_property PACKAGE_PIN G18 [get_ports eth_ref_clk]

AXI Timer

I’ve included the AXI Timer IP in the base design, because it’s needed by the lwIP echo server application AND PetaLinux. We’ll build PetaLinux for the Arty in a future tutorial.

UART settings

To read the Arty’s console output, you’ll have to use a UART console such as PuTTY and connect to the COM port that the Arty is assigned when you plug it into the PC. To find the right COM port, just go into the Device Manager after connecting the Arty to your PC via USB. Once you’ve got that, just remember to use a baud rate of 9600 and you’ll be in business.

↧

PetaLinux for Artix-7 Arty Base Project

November 15, 2017, 7:58 am

In the final part of the Arty base project tutorial, we build a PetaLinux project that’s tailored to our Arty base design. Then we boot PetaLinux on our hardware and verify that we have network connectivity by checking the Arty’s DHCP assigned IP address and then pinging it from a PC.

Tools used

I used the following setup to do this project:

Vivado 2017.3 on a Windows 10 machine
PetaLinux 2017.3 on an Ubuntu 16.04 LTS machine

Vivado project modifications

Before we get started with PetaLinux, we have to make sure that our Vivado design satisfies the minimum requirements for running PetaLinux:

MicroBlaze must use configuration “Linux with MMU” or “Low-end Linux with MMU”
At least 32MB of external memory
Dual channel timer with interrupt connected
UART IP with interrupt connected
Ethernet IP with interrupt connected

Our original base design satisfies all but one of those requirements: the first one. So the first thing we have to do in this tutorial is to select the “Linux with MMU” configuration for the MicroBlaze. The next thing we do is to enable the GPIO interrupts and connect them through to the MicroBlaze. This isn’t a requirement, but it’s useful. We then have to save our block design, re-generate a bitstream for the project and export it.

PetaLinux tool commands

To build the PetaLinux project, we transfer our entire Vivado project to a Linux machine with the PetaLinux tools installed. These are the PetaLinux tool commands that we use in the tutorial, in the order that we use them:

  # Launch PetaLinux tools (note that you'll have to specify your own PetaLinux install path)
  source ./PetaLinux-2017-3/settings.sh
  # Cd to the working directory (where the arty_base directory has been copied to)
  cd /media/opsero/arty
  # Create the PetaLinux project, using the "microblaze" template
  petalinux-create --type project --template microblaze --name arty_petalinux
  # Cd to the PetaLinux project
  cd arty_petalinux
  # Import the hardware description into our PetaLinux project
  petalinux-config --get-hw-description ../arty_base/arty_base.sdk --oldconfig
  # Optional: Configure the kernel
  petalinux-config -c kernel
  # Optional: Configure the root filesystem
  petalinux-config -c rootfs
  # Build the PetaLinux project
  petalinux-build

Kernel configuration

The Linux driver for the AXI Ethernetlite IP requires certain options to be enabled in the PetaLinux kernel. Fortunately, the PetaLinux tools are pretty good at enabling the drivers for the IP that they find in your exported Vivado design. So the required drivers are already enabled and we don’t have to run the kernel configuration (petalinux-config -c kernel), but for completeness, here is a list of the required kernel configurations:

CONFIG_ETHERNET
CONFIG_NET_VENDOR_XILINX
CONFIG_XILINX_EMACLITE
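
If you want to double-check that those options really are enabled, you can scan the kernel configuration text for them. Below is a minimal Python sketch that does this on a string in the standard kernel .config format. The sample text is made up for illustration, and I haven't assumed where your PetaLinux build keeps its .config file; read that file's contents into a string and pass it in.

```python
REQUIRED_OPTIONS = (
    "CONFIG_ETHERNET",
    "CONFIG_NET_VENDOR_XILINX",
    "CONFIG_XILINX_EMACLITE",
)


def missing_options(config_text, required=REQUIRED_OPTIONS):
    """Return the required options that are not set to y or m in config_text."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Disabled options appear as comments: "# CONFIG_FOO is not set"
        if not line or line.startswith("#"):
            continue
        name, _, value = line.partition("=")
        if value in ("y", "m"):
            enabled.add(name)
    return [opt for opt in required if opt not in enabled]


# Hypothetical sample, with one of the three options disabled
SAMPLE_CONFIG = """\
CONFIG_ETHERNET=y
CONFIG_NET_VENDOR_XILINX=y
# CONFIG_XILINX_EMACLITE is not set
"""

if __name__ == "__main__":
    print("missing:", missing_options(SAMPLE_CONFIG))
```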

Device tree modification

We have to make an addition to the device tree in order to specify the Ethernet PHY’s address with respect to the MDIO bus. This address depends on how the PHY is physically wired. For any particular board it is usually mentioned in the user guide; if not, we can usually figure it out from the schematics. In the case of the Arty, the PHY address is 1 (one) and we need to specify this in the device tree so that the Ethernet driver can communicate with the PHY. Below is the device tree code that we need to add to the system-user.dtsi file.

arty_petalinux/project-spec/meta-user/recipes-bsp/device-tree/files/system-user.dtsi

&axi_ethernetlite_0 {
  local-mac-address = [00 0a 35 00 01 22];
  phy-handle = <&phy0>;
  xlnx,has-mdio = <0x1>;
  mdio {
    #address-cells = <1>;
    #size-cells = <0>;
    phy0: phy@1 {
      device_type = "ethernet-phy";
      reg = <1>;
    };
  };
};

Launching it on the Arty

Once the PetaLinux project is built, we launch the PuTTY UART console, program the FPGA with the bitstream and then load the kernel. Here are the commands we used:

  # Launch Putty, the UART console
  sudo putty &
  # Program the FPGA with the bitstream
  petalinux-boot --jtag --fpga
  # Load the kernel into memory and run it
  petalinux-boot --jtag --kernel

How to package the PetaLinux project

It’s useful to be able to program the flash with our bitstream and Linux kernel so that it boots up automatically when we power up the board. To be able to do this, we need to package the PetaLinux project and generate a .mcs file. We don’t go through this in the video, but if you’re interested, here’s how to do it:

In the Linux command terminal, type:

petalinux-config

In the menu, enable the following option:

Subsystem AUTO Hardware Settings->Advanced bootable images storage Settings

Set the flash partition sizes as follows:

Subsystem AUTO Hardware Settings->Flash Settings

fpga partition size    0x300000
boot partition size    0x100000
bootenv partition size 0x100000
kernel partition size  0xA40000

Build the PetaLinux project:

petalinux-build

Package the PetaLinux project:

petalinux-package --boot --force --fpga ../arty_base/arty_base.runs/impl_1/design_1_wrapper.bit --u-boot --kernel --flash-size 16 --flash-intf SPIx1

You’ll find the boot.mcs file under arty_petalinux/images/linux.
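
As a sanity check on the partition sizes used above, the following Python sketch adds them up and compares the total with the 16MB passed to --flash-size. It assumes the partitions are packed back to back starting at offset 0, in the order they are listed in the menu; the actual offsets are decided by the PetaLinux tools.

```python
FLASH_SIZE = 16 * 1024 * 1024   # 16 MB, matching --flash-size 16

PARTITIONS = [                  # (name, size in bytes), in menu order
    ("fpga", 0x300000),
    ("boot", 0x100000),
    ("bootenv", 0x100000),
    ("kernel", 0xA40000),
]


def layout(partitions=PARTITIONS):
    """Return (name, start, end) per partition, assuming back-to-back packing from 0."""
    offset = 0
    result = []
    for name, size in partitions:
        result.append((name, offset, offset + size))
        offset += size
    return result


def total_size(partitions=PARTITIONS):
    """Sum of all partition sizes in bytes."""
    return sum(size for _, size in partitions)


if __name__ == "__main__":
    for name, start, end in layout():
        print(f"{name:8s} 0x{start:06X} - 0x{end - 1:06X}")
    print(f"total 0x{total_size():X}, free 0x{FLASH_SIZE - total_size():X}")
```

If you grow the kernel partition to fit a larger image, rerun the check to make sure the total still fits in the flash.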

Try it yourself

If you want to run this project on your Arty board, just download the boot files that I’ve provided here: Arty PetaLinux boot files

JTAG instructions

In the compressed file, you’ll find a bitstream and .elf file (the PetaLinux kernel) that can be downloaded to your Arty via JTAG using the XMD tool. Launch XMD and type these commands:

  
  fpga -f design_1_wrapper.bit
  connect mb mdm
  dow image.elf
  run

Flash instructions

Also in the compressed file, you’ll find a .mcs file that you can program into the Arty’s flash memory so that PetaLinux boots up every time you power up the board. To program the Arty’s flash memory:

launch the Hardware Manager in Vivado
make a connection with the FPGA
add configuration memory device “n25q128-3.3v-spi-x1_x2_x4”
program the configuration memory device with the .mcs file

Digilent has a good tutorial on this here: Programming the Arty using Quad SPI Flash

Make sure to open a UART terminal for a baud rate of 9600, so that you don’t miss the boot log. Also, remember to connect the Arty to your network router so that the IP address gets automatically assigned during the boot sequence.

↧

Create a custom PYNQ overlay for PYNQ-Z1

March 15, 2018, 12:31 pm

In this video tutorial we create a custom PYNQ overlay for the PYNQ-Z1 board. Probably the simplest PYNQ overlay possible, it contains one custom IP (an adder) with an AXI-Lite interface and three registers accessible over that interface: a, b and c. To use the IP we write a number to input registers a and b, and then we read the output register c which contains the sum of a and b. We create the IP in Vivado HLS, we then create the overlay in Vivado and bring the IP into our block design. Then we copy the overlay files (.bit and .tcl) over the network and onto the SD card of the PYNQ-Z1 board. Finally we open the Jupyter web application from a web browser and we write some Python code to test out our overlay and custom IP.

The tutorial is based on the one in the PYNQ online documentation here: PYNQ Overlay Tutorial. I suggest you refer to the tutorial for the code blocks but I recommend that you also read through it (before or after the video) to allow the concepts to sink in.

In this tutorial, the custom IP doesn’t make a good accelerator but it’s useful for demonstrating some of the basic PYNQ concepts such as the Overlay and DefaultIP drivers. In the next tutorial we’ll design an IP that can process a block of data, and hook it up to a DMA so that we can demonstrate real speed gains by offloading work to the FPGA.
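
To make the register interface concrete, here is a pure-Python stand-in for the adder that you can run without a board. It is not the pynq API: FakeAdderIP is a hypothetical class of mine, the register offsets (0x10, 0x18, 0x20) are an assumption about the addresses Vivado HLS generates, and the 32-bit wrap-around is also assumed. On the real board you would load the overlay with pynq's Overlay class and call read and write on the IP object in the same way.

```python
class FakeAdderIP:
    """Software stand-in for the adder IP; NOT part of the pynq package."""

    A_OFFSET = 0x10   # assumed offset of input register a
    B_OFFSET = 0x18   # assumed offset of input register b
    C_OFFSET = 0x20   # assumed offset of output register c

    def __init__(self):
        self._regs = {self.A_OFFSET: 0, self.B_OFFSET: 0}

    def write(self, offset, value):
        """Write a 32-bit value to one of the two input registers."""
        if offset not in self._regs:
            raise ValueError(f"no writable register at offset 0x{offset:X}")
        self._regs[offset] = value & 0xFFFFFFFF

    def read(self, offset):
        """Read a register; c returns the 32-bit sum of a and b."""
        if offset == self.C_OFFSET:
            total = self._regs[self.A_OFFSET] + self._regs[self.B_OFFSET]
            return total & 0xFFFFFFFF
        if offset in self._regs:
            return self._regs[offset]
        raise ValueError(f"no register at offset 0x{offset:X}")


def add(ip, a, b):
    """Write a and b to the input registers, then read the sum back from c."""
    ip.write(ip.A_OFFSET, a)
    ip.write(ip.B_OFFSET, b)
    return ip.read(ip.C_OFFSET)


if __name__ == "__main__":
    print("4 + 5 =", add(FakeAdderIP(), 4, 5))
```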

↧