Mockspec by jrvliet

Welcome to Mockspec

This is the main page for the mockspec code. This code generates and analyzes synthetic quasar absorption line profiles in the circumgalactic medium around galaxies simulated with ART. It has been developed by Jacob Vander Vliet and Dr. Chris Churchill at New Mexico State University. The full details of how the code works can be found: here and here.

Overview:

This code generates synthetic quasar absorption lines by running lines of sight (LOS) through a simulated galaxy made with ART. This repository has all the code required to generate and analyze the spectra.

Before you run the code:

There are a few steps that need to be taken before the code can be run.

To run this code, the following python packages need to be installed:
- numpy
- pytables
- pandas
- astropy
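
A quick way to confirm the packages above are available is to ask Python whether it can locate each one. This is a minimal sketch and not part of mockspec; the function name missing_modules is hypothetical. Note that PyTables is imported under the name tables.

```python
import importlib.util

# Import names for the packages listed above. PyTables is imported as
# "tables" even though the package is usually called pytables.
REQUIRED_MODULES = ("numpy", "tables", "pandas", "astropy")


def missing_modules(modules=REQUIRED_MODULES):
    """Return the module names that cannot be found in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are installed.")
```
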
Get the simulation in the correct format. ART outputs the entire simulation box, typically 10 Mpc on a side. Mockspec does not run on these boxes. It instead runs on a small box selected from the large box. This selection is done using ANA. There are two versions of ANA. The following instructions are for the version written by Daniel Ceverino, which is found in funcs/ana/. Usually, you should use the second version, described further down. Running ANA requires the following simulation files:
- 10MpcBox_csf512.d
- 10MpcBox_HartGal_csf_a0.300.d
- 10MpcBox_HartGal_csf.d
- PMcrda0.300.DAT
- PMcrs0a0.300.DAT
- sf.dat
To control the run of ANA, there are two control files:
- control.dat
- schedule_R.dat
schedule_R.dat has most of the controls in it. The top section has flags that set what the code should read in and what files it should produce. The flags that must be set to 1 are:
- Read N-body file
- Read stellar file
- Read HYDRO file
- Set units and global variables
- Find halo center using particle distribution
- Compute angular momentum
- density, temperature, entropy profiles and Rvir calculation
- Binary file with gas cells inside 4Rvir-box
- Binary file with gas cells inside 4Rvir-box (metallicities)
ANA finds the center of the galaxy on its own, but does so poorly. You need to find the center with Rockstar (found here) and put it in manually. The Rockstar center is found in the halos* files. These files contain ALL the halos in the simulation, ordered by mass. Since we want the host galaxy, use the first halo listed.
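
Pulling the host-galaxy center out of a Rockstar catalog can be scripted. The sketch below assumes the halos* file is whitespace-delimited text whose first line starting with # names the columns, and that those names include x, y, z, mvir and rvir. Neither the layout nor the column names are stated here, so check them against your own files. The function name first_halo is hypothetical.

```python
def first_halo(lines, wanted=("x", "y", "z", "mvir", "rvir")):
    """Return the wanted columns of the first halo row as a dict of floats.

    Assumes (not confirmed by this README) that the first '#' line lists
    the column names and that the first data row is the host galaxy.
    """
    header = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Only the first comment line is treated as the column header.
            if header is None:
                header = line.lstrip("#").split()
            continue
        if header is None:
            raise ValueError("no header line found before the data")
        fields = line.split()
        return {name: float(fields[header.index(name)]) for name in wanted}
    raise ValueError("no halo rows found")


# Usage with a real file (path is hypothetical):
# with open("halos_0.0.ascii") as catalog:
#     host = first_halo(catalog)
#     print(host["x"], host["y"], host["z"])
```
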

The coordinates are entered on the lines "xuser", "yuser", "zuser". The flag in front of "ioptCenter=1/2/3-->potential/HF/IFRIT" should be set to 2 (the center comes from a halo finder) if the simulation is a VELA, or to 1 (the center is the location of maximum potential) if the simulation is a dwarf. This setting is the biggest source of uncertainty in ANA. Always check the output Mvir and Rvir against the values from Rockstar, which are recorded along with the centers. If they differ by a significant amount, change the setting of ioptCenter and try again.
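
The Mvir and Rvir comparison can be made explicit with a fractional difference. This is only a sketch: what counts as "a significant amount" is not defined here, so the 10% default tolerance below is an assumption, as are the function names.

```python
def fractional_difference(ana_value, rockstar_value):
    """Fractional difference of the ANA value relative to the Rockstar value."""
    return abs(ana_value - rockstar_value) / abs(rockstar_value)


def center_looks_consistent(ana_mvir, ana_rvir, rockstar_mvir, rockstar_rvir,
                            tolerance=0.1):
    """True if both Mvir and Rvir agree within the (assumed) tolerance."""
    return (fractional_difference(ana_mvir, rockstar_mvir) <= tolerance
            and fractional_difference(ana_rvir, rockstar_rvir) <= tolerance)
```

If the check fails, that is the cue to switch ioptCenter and rerun ANA.
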

Line 30 of schedule_R.dat should be the single word 'test'. This is the prefix added to the front of the names of all files created by ANA. Change this to the galID, such as D9m4a or vela20v2.

The rest of the file should not be touched.

To set up control.dat, only change the lines after the three repeated lines of "1. 129". The single number on the next line is the number of snapshots to run ANA on. Leave this as one. The rest of the file is the list of expansion parameters of the snapshots. Even if there are several numbers listed here, ANA will only run on the first one, since the number of snapshots to run is set to 1. Also, ensure the first line, labeled jobname1, is set to 10MpcBox_HartGal_csf. This is the large 10 Mpc box that will be read.

Once both control.dat and schedule_R.dat are set, run ana.R.exe. The file needed is the GZ file. The necessary output files are named <galID>_GZa<expansion parameter>.txt (the gas box) and summary.txt (summary file).
To use the summary file, its contents need to be added to the galaxy's summary file. These files are named <galID>.dat and are located in the summaries directory. These properties are required for the rest of the pipeline, mainly cellfinder, to work. The rotation matrix a is one of the most important results from ANA and the main reason we still use it.
The pipeline does not run on the binary files. It runs on an ASCII version of the file. To convert, use readbinary.exe. Run it with the format of
readbinary.exe <binary file> <name of output ASCII file> GZ
The name of the output ASCII file should be the name of the binary file, but with .dat replaced with .txt
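
The naming rule for the ASCII file can be applied automatically when building the command. This is a sketch, not part of the pipeline; the function names and the example file name are hypothetical, and readbinary.exe is assumed to be reachable from the working directory.

```python
import subprocess
from pathlib import Path


def readbinary_command(binary_file, executable="readbinary.exe"):
    """Build the readbinary.exe argument list for a binary GZ file.

    The ASCII output keeps the binary file's name with .dat replaced by .txt.
    """
    binary_path = Path(binary_file)
    if binary_path.suffix != ".dat":
        raise ValueError(f"expected a .dat file, got {binary_file!r}")
    ascii_path = binary_path.with_suffix(".txt")
    return [executable, str(binary_path), str(ascii_path), "GZ"]


def run_readbinary(binary_file):
    """Run the conversion and raise if readbinary.exe reports a failure."""
    subprocess.run(readbinary_command(binary_file), check=True)


# Example (file name is hypothetical):
# run_readbinary("D9m4a_GZa0.300.dat")
```
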
In most cases, Ceverino's version of ANA is not the version you should use. You should use the version written by Kenza Arraki. It is found in ana/justGZ/. Depending on the type of simulation you are analyzing, you may have to change some of the code. The line to look at is line 294 of ART_IO.F. It should look like:
integer lspecies(nspec)

If you are analyzing a Vela of the 4th generation, this needs to be changed to:

integer*8 lspecies(nspec)

The other change is to use the correct a_setup.h file. The justGZ directory has three available: Dwarf_a_setup.h, VELA_a_setup.h, VELA4_a_setup.h. Copy the one you are using to a_setup.h and make the code.
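
Selecting the header can be scripted so the wrong file is not copied by accident. This is a sketch only: the short labels "dwarf", "vela" and "vela4" are hypothetical, and mapping 4th generation Velas to VELA4_a_setup.h is an assumption based on the file names.

```python
import shutil
from pathlib import Path

# Hypothetical short labels mapped to the three headers in justGZ.
SETUP_HEADERS = {
    "dwarf": "Dwarf_a_setup.h",
    "vela": "VELA_a_setup.h",
    "vela4": "VELA4_a_setup.h",
}


def install_setup_header(sim_type, justgz_dir="."):
    """Copy the header for sim_type to a_setup.h and return the new path."""
    try:
        header_name = SETUP_HEADERS[sim_type.lower()]
    except KeyError:
        raise ValueError(f"unknown simulation type: {sim_type!r}") from None
    source = Path(justgz_dir) / header_name
    target = Path(justgz_dir) / "a_setup.h"
    shutil.copyfile(source, target)
    return target
```

After the copy, make the code as usual.
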

To use this version, the directory has to be set up properly. There needs to be a directory called outputs in the location where you are running the code. In this directory there needs to be a directory called rockstar. There also needs to be a file called input_.txt. This file is output by the Rockstar code.
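
The two directories can be created in one step before running. A minimal sketch, with a hypothetical function name; it does not create or check the Rockstar input file.

```python
from pathlib import Path


def prepare_run_directory(run_dir="."):
    """Create outputs/rockstar under run_dir if it does not already exist."""
    rockstar_dir = Path(run_dir) / "outputs" / "rockstar"
    # parents=True also creates "outputs"; exist_ok=True makes reruns harmless.
    rockstar_dir.mkdir(parents=True, exist_ok=True)
    return rockstar_dir
```
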
When the code is run, it will output the GZ box in ASCII into the output directory. It will be called _GZa.txt. There will also be a file called ioread.txt which is the screen output of ANA and a file called rotmat_a.txt. This contains the rotation matrix for the box.
Configure the run. The parameters of the run are set in the mockspec.config file. There are three sections in this file. The first describes the galaxy and basic code parameters. These are mostly self-explanatory. The number of cores only affects the codes that can be run with parallelization, namely cellfinder. The root location is not used by anything, so don't change it. The second section determines which codes will be run. A 1 turns the code on, a 0 turns it off. The third section describes the ions to be probed and what instrument will be used for each ion.
Ensure the directory has all required files, namely the galaxy file
Make all compiled codes. This includes: rates, cellfinder, los7, specsynth, sysabs, and cullabs. To make rates, you need to download and untar the following files:

Running the code

The main driver is mockspec.py. Run this in the directory with the snapshot. Everything is controlled by the mockspec.config file.

Outputs

The code generates several files along the way:

los<losnum>.cellID.dat: Generated by cellfinder, this is a list of the cellIDs of cells along the LOS
<galID>.<ion>.los<losnum>.dat: Generated by idcells, this contains all properties of the cells along the LOS as taken from the simulation output.
<galID>.<ion>.los<losnum>.losdata: Generated by los7, this contains derived properties of each cell along the LOS.
<galID>.<ion>.los<losnum>.lines: Generated by los7, this contains the column density and Doppler parameter for each cell along the LOS.
<galID>.<ion>.los<losnum>.<ion><transition>.spec: Generated by specsynth, this contains the spectrum of the absorption. If the ion is a doublet, two of these files are created, one for each transition.
<galID>.<ion>.los<losnum>.sysabs: Generated by sysanal, this contains the equivalent width and AOD column density of each transition. This file is only generated if absorption is detected.
<galID>.<ion>.los<losnum>.regabs: Generated by sysanal, this file is only generated if there are multiple absorption regions detected.
<galID>.<ion>.ALL.sysabs: Generated by cullabs, this is all sysabs files combined into one file.
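
When scripting around the pipeline it helps to build these names in one place. The sketch below follows the patterns listed above; the function name and dictionary keys are hypothetical, and losnum is passed as a string because any zero-padding used by the pipeline is not specified here.

```python
def output_filenames(gal_id, ion, losnum, transitions=()):
    """Return the expected output file names for one line of sight.

    losnum is inserted exactly as given; transitions is a sequence of
    transition labels, one .spec file per label.
    """
    stem = f"{gal_id}.{ion}.los{losnum}"
    return {
        "cell_ids": f"los{losnum}.cellID.dat",
        "cells": f"{stem}.dat",
        "losdata": f"{stem}.losdata",
        "lines": f"{stem}.lines",
        "spectra": [f"{stem}.{ion}{t}.spec" for t in transitions],
        "sysabs": f"{stem}.sysabs",
        "all_sysabs": f"{gal_id}.{ion}.ALL.sysabs",
    }
```

For a doublet, pass both transition labels to get both spectrum names.
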

Reading the Outputs

Most of the outputs are stored in the HDF5 file format. This format results in smaller output files that write and read faster than plain text, and it is widely used for scientific data. To access the files, it is helpful to know the columns included in each file:

abs_cells:

LOS: Line of sight number
D: Impact parameter
cellID: Identification number of the cell
redshift: Redshift of the cell, based on the redshift of the galaxy and the cell's line-of-sight velocity
logN: Column density of the ion in the cell
dobbler_b: Doppler b parameter = sqrt(2kT/m)
x: x-coordinate of the cell in the box's coordinate system in [kpc]
y: y-coordinate of the cell in the box's coordinate system in [kpc]
z: z-coordinate of the cell in the box's coordinate system in [kpc]
vx: vx-component of the cell's velocity in the box's coordinate system in [km/s]
vy: vy-component of the cell's velocity in the box's coordinate system in [km/s]
vz: vz-component of the cell's velocity in the box's coordinate system in [km/s]
galactocentric_d: Distance between the cell and the center of the galaxy [kpc]
log_nH: Log of the gas density [num/cc] of the cell
log_T: Log of the temperature [K] of the cell
cell_size: Size of the cell [kpc]
SNII: Contribution to the metal mass fraction from SNII
SNIa: Contribution to the metal mass fraction from SNIa
alpha_Zmet: Metallicity of the cell in standard Z style
ion_density: Density of the ion in the cell [num/cc]
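
The Doppler b formula quoted above, sqrt(2kT/m), can be evaluated directly. This sketch computes only the thermal term, in km/s, from a temperature in K and an atomic mass in atomic mass units; whether the pipeline adds any other broadening is not stated here. The function name is hypothetical.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23        # Boltzmann constant (SI)
ATOMIC_MASS_UNIT_KG = 1.66053906660e-27  # 1 amu in kg


def thermal_doppler_b(temperature_k, atomic_mass_amu):
    """Thermal Doppler parameter b = sqrt(2kT/m), returned in km/s."""
    mass_kg = atomic_mass_amu * ATOMIC_MASS_UNIT_KG
    b_m_per_s = math.sqrt(2.0 * BOLTZMANN_J_PER_K * temperature_k / mass_kg)
    return b_m_per_s / 1.0e3


# Hydrogen (1.008 amu) at 10^4 K gives roughly 12.8 km/s.
# print(thermal_doppler_b(1.0e4, 1.008))
```

Because b scales as 1/sqrt(m), heavier ions at the same temperature have narrower thermal widths.
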

ALL.sysabs:

los: Line of sight number
D: Impact parameter [kpc]
zabs: Redshift of the absorption feature
v-: Lower edge of the absorption feature in velocity space
v+: Upper edge of the absorption feature in velocity space
EW_r: Rest frame equivalent width of the absorption feature
dEW_r: Uncertainty in EW_r
DR:
dDR: Uncertainty in DR
SL: Significance level of the detection
Vbar: Mean velocity of the absorption
dVbar: Uncertainty in Vbar
Vsprd: Spread of the absorption in velocity space
dVsprd: Uncertainty in Vsprd
Vasym:
dVasym: Uncertainty in Vasym
lgt:
dtau-: Lower uncertainty in optical depth
dtau+: Upper uncertainty in optical depth
logN: Log of column density of the absorption
dlogN-: Lower uncertainty in logN
dlogN+: Upper uncertainty in logN

Gasbox:

cell_size: Size of the cell [kpc]
x: x-coordinate of the cell in box reference frame [kpc]
y: y-coordinate of the cell in box reference frame [kpc]
z: z-coordinate of the cell in box reference frame [kpc]
vx: vx component of the cell's velocity in box reference frame [km/s]
vy: vy component of the cell's velocity in box reference frame [km/s]
vz: vz component of the cell's velocity in box reference frame [km/s]
nH: Gas density in the cell [num/cc]
temperature: Temperature of gas in the cell [K]
SNII: Contribution to the metal mass fraction from SNII
SNIa: Contribution to the metal mass fraction from SNIa
nAtom: Density of the element in question (such as carbon or oxygen) in the cell [num/cc]
fIon: Ionization fraction of the ion in question within the cell
nIon: Density of the ion in the cell [num/cc]
alpha_sol: Solar abundance of the element
alpha_Zmet: Abundance of the element in the cell relative to solar
ID: Identification number of the cell
t_ph: Timescale of photoionization in the cell
t_rec: Timescale of recombination in the cell
t_coll: Timescale of collisional ionization in the cell
t_cool: Timescale of cooling in the cell
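
The Gasbox columns nAtom, fIon and nIon are related by their definitions: the ion density is the element density times the ionization fraction. The sketch below shows that relation and a rough per-cell column density. The column density assumes the line of sight crosses the full cell_size, which is an assumption made here for illustration; the pipeline may use the true path length through each cell. Function names are hypothetical.

```python
import math

KPC_IN_CM = 3.0857e21  # centimeters per kiloparsec


def ion_density(n_atom, f_ion):
    """Ion number density [num/cc] from element density and ion fraction."""
    return n_atom * f_ion


def log_column_density(n_ion, path_length_kpc):
    """log10 of the column density [cm^-2] for a path through uniform gas."""
    return math.log10(n_ion * path_length_kpc * KPC_IN_CM)
```
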