Documentation for MatrixTDB tables
Some brief notes about the tables in MatrixTDB2, what they are for, and what is stored in each of the columns.
Note that we discuss MatrixTDB as the system, but the current database on the MySQL server on capuchin being used by the system is MatrixTDB2, not MatrixTDB. MatrixTDB (the database) still exists, but is no longer being used.
Overview of tables
First, what they are for, in alphabetical order:
extr_source_profile | not used in MatrixTDB2 |
extract_profile_run | not used in MatrixTDB2 |
extract_result | not used in MatrixTDB2 |
extract_results | not used in MatrixTDB2 |
feat_grp | Puts combinations of features that are used in filters together. Every feature/value combination found in filters and imported choices files (language types) is entered as a singleton group (group with just one feature/value combo). Every ‘and’ group of feature/value combos in filters is entered as a group, i.e. with a common fg_grp_id |
filter | Filter repository whose purpose is to match the names of the filters in s_filters.py to the IDs used as foreign keys in fltr_feat_grp. |
fltr_feat_grp | Intersection table that matches filters up to the feature groups to which they apply. Used in combination with lt_feat_grp to get the filters that apply to a language type |
fltr_mrs | For all filters, lists the mrs tags (aka semantic equivalence classes) that they apply to. |
grammar | not used in MatrixTDB2 |
harv_str | Stores the harvester strings given when importing a source profile |
item | Mirrors the item file in a [incr_tsdb()] profile with the addition of i_osp_id which is a foreign key to orig_source_profile.osp_orig_src_prof_id |
lt | A repository of language types that have been imported, either by importing source profiles or separately |
lt_feat_grp | Intersection table that matches language types up with the feature groups they have. Used in combination with fltr_feat_grp to find filters that are relevant to a language type |
mrs | Stores the mrs semantics for each of the mrs tags imported |
orig_source_profile | Stores import information about the source profiles |
parse | Mirrors the item file in a [incr_tsdb()] profile with the addition of p_osp_id which is a foreign key to orig_source_profile.osp_orig_src_prof_id |
res_fltr | Obsolete intersection table that stores the ‘result’ of applying a universal filter to the ‘result’ in a [incr_tsdb()] profile. Was only being used to track at most the first fail found for a given result, now not being used at all since we run through universal filters before even adding to item_tsdb, parse, result tables |
res_sfltr | Intersection table that stores the ‘result’ of applying a specific filter to the ‘result’ in a [incr_tsdb()] profile. Currently only used to store fails, not passes or does-not-applys |
result | Mirrors the result file in the source profile with the addition of the r_osp_id which is a foreign key to orig_source_profile.osp_orig_src_prof_id and r_esp_id which is NULL |
seed_str | Stores the seed strings which are the harvester strings after having gone through stringmods stored in a canonical form: words, prefixes, suffixes, with all three of those being in alphabetical order |
sp_item | Mirrors the item file in the source profile with the addition of the spi_osp_id which is a foreign key to orig_source_profile.osp_orig_src_prof_id |
sp_parse | Mirrors the parse file in the source profile with the addition of the spp_osp_id which is a foreign key to orig_source_profile.osp_orig_src_prof_id. |
sp_result | Mirrors the result file in the source profile with the addition of the spr_osp_id which is a foreign key to orig_source_profile.osp_orig_src_prof_id. |
str_lst | Intersection table that maps seed strings to mrs tags |
Documentation of columns
Next, for each table, some documentation of the columns. Note that these are grouped by function.
Tables used to store source profile information
These tables are updated by import_from_itsdb.py and used by add_permutes.py.
orig_source_profile
Stores import information about the source profiles
+----------------------+------------------+------+-----+-------------------+----------------+
| Field | Type | Null | Key | Default | Extra |
+----------------------+------------------+------+-----+-------------------+----------------+
| osp_orig_src_prof_id | int(10) unsigned | NO | PRI | NULL | auto_increment |
| osp_developer_name | char(24) | YES | | NULL | |
| osp_prod_date | timestamp | NO | | CURRENT_TIMESTAMP | |
| osp_orig_lt_id | int(11) | YES | | NULL | |
| osp_comment | varchar(1000) | NO | | | |
+----------------------+------------------+------+-----+-------------------+----------------+
osp_orig_src_prof_id | Primary key |
osp_developer_name | The developer who imported the source profile |
osp_prod_date | The date the profile was imported |
osp_orig_ld_id | he ID of the language type that produced the profile |
osp_comment | A user-entered description of the source profile |
harv_str
Stores the harvester strings given when importing a source profile. Harvester strings serve a dual purpose: They are parsed with a harvester grammar in order to get the mrs for the semantic equivalence class they represent, and they are also used by the constructor function as the starting point for creating the strings for that equivalence class.
+----------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+----------------+--------------+------+-----+---------+----------------+
| hs_id | int(11) | NO | PRI | NULL | auto_increment |
| hs_string | varchar(100) | NO | MUL | | |
| hs_mrs_tag | char(20) | NO | MUL | | |
| hs_init_osp_id | int(11) | NO | | | |
| hs_cur_osp_id | int(11) | NO | MUL | | |
+----------------+--------------+------+-----+---------+----------------+
hs_id | Primary key |
hs_string | The harvester string |
hs_mrs_tag | The mrs tag that labels the sem. equiv. class |
hs_init_osp_id | The initial original source profile this string/tag combo was imported as a part of |
hs_cur_osp_id | The most recent original source profile this string/tag combo was imported as a part of (this is how we update mrss) |
mrs
The filters (and parts of the constructor function) are sensistive to the semantic equivalence class of the items they apply to. We use mrs tags to identify the equivalence classes instead of the actual mrss for two reasons 1) they’re much easier to manipulate (and store) and 2) this should help when we get to irreducibly ambiguous harvester strings.
Because the actual mrs values will change over time, we allow multiple rows in this table per mrs tag. The field mrs_current flags only one row per mrs tag as the one to use in test suites for the current system. Previous mrs as preserved in case we need to create test suites corresponding to an older state of the system.
+-------------+---------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+---------------+------+-----+---------+----------------+
| mrs_id | int(11) | NO | PRI | NULL | auto_increment |
| mrs_tag | char(20) | YES | MUL | NULL | |
| mrs_value | varchar(1000) | YES | MUL | NULL | |
| mrs_date | datetime | YES | | NULL | |
| mrs_osp_id | int(11) | YES | | NULL | |
| mrs_current | tinyint(4) | YES | | 1 | |
+-------------+---------------+------+-----+---------+----------------+
mrs_id | Primary key |
mrs_tag | Mrs semantic tag |
mrs_vale | The hairy mrs semantics produced in the source profile by [incr_tsdb()] |
mrs_date | NULL |
mrs_osp_id | The ID of the original source profile the mrs was imported from |
mrs_current | 1 if this value of the tag is the current one |
sp_item
Mirrors the item file in the source profile with the addition of the spi_osp_id which is a foreign key to orig_source_profile.osp_orig_src_prof_id. sp_item stands for `source profile: item’ In [incr tsdb()], the item file contains the input items, along with information about their well-formedness, length, and other characteristics.
Here are the columns:
+----------------+------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+----------------+------------------+------+-----+---------+----------------+
| spi_id | int(10) unsigned | NO | PRI | NULL | auto_increment |
| spi_origin | char(20) | YES | | | |
| spi_register | char(20) | YES | | | |
| spi_format | char(20) | YES | | | |
| spi_difficulty | int(4) | YES | | -1 | |
| spi_category | char(20) | YES | | | |
| spi_input | varchar(512) | YES | | | |
| spi_wf | int(4) | YES | | 1 | |
| spi_length | int(4) | YES | | -1 | |
| spi_comment | varchar(256) | YES | | | |
| spi_author | char(50) | YES | | | |
| spi_date | char(10) | YES | | | |
| spi_osp_id | int(11) | NO | | | |
+----------------+------------------+------+-----+---------+----------------+
Most of these are not of interest here (and will likely soon be suppressed). We have meaningful data in the following columns:
spi_id | the item id |
spi_input | the actual input string |
spi_length | the number of words in the input string |
spi_author | the author of the item as recorded by [incr tsdb()] |
spi_osp_id | the ID of the original source profile the item came from (not an itsdb field) |
sp_parse
Mirrors the parse file in the source profile with the addition of the spp_osp_id which is a foreign key to orig_source_profile.osp_orig_src_prof_id. sp_parse stands for `source profile: parse’ and corresponds to the [incr tsdb()] parse file. The parse file contains information about the processing of each item in a given test run. For our purposes (constructing gold standard profiles for comparison), most of this data is not useful. Instead, parse serves to link item and result, and we may end up dispensing with it and creating it on the fly on export.
Here are the columns:
+------------------+------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+------------------+------------------+------+-----+---------+----------------+
| spp_parse_id | int(10) unsigned | NO | PRI | NULL | auto_increment |
| spp_run_id | int(11) | YES | | -1 | |
| spp_i_id | int(11) | NO | | | |
| spp_readings | int(11) | YES | | -1 | |
| spp_first | int(11) | YES | | -1 | |
| spp_total | int(11) | YES | | -1 | |
| spp_tcpu | int(11) | YES | | -1 | |
| spp_tgc | int(11) | YES | | -1 | |
| spp_treal | int(11) | YES | | -1 | |
| spp_words | int(11) | YES | | -1 | |
| spp_l_stasks | int(11) | YES | | -1 | |
| spp_ctasks | int(11) | YES | | -1 | |
| spp_ftasks | int(11) | YES | | -1 | |
| spp_etasks | int(11) | YES | | -1 | |
| spp_stasks | int(11) | YES | | -1 | |
| spp_aedges | int(11) | YES | | -1 | |
| spp_pedges | int(11) | YES | | -1 | |
| spp_raedges | int(11) | YES | | -1 | |
| spp_rpedges | int(11) | YES | | -1 | |
| spp_unifications | int(11) | YES | | -1 | |
| spp_copies | int(11) | YES | | -1 | |
| spp_conses | int(11) | YES | | -1 | |
| spp_symbols | int(11) | YES | | -1 | |
| spp_others | int(11) | YES | | -1 | |
| spp_gcs | int(11) | YES | | -1 | |
| spp_i_load | int(11) | YES | | -1 | |
| spp_a_load | int(11) | YES | | -1 | |
| spp_date | char(22) | YES | | | |
| spp_error | varchar(1000) | YES | | | |
| spp_comment | varchar(1000) | YES | | | |
| spp_osp_id | int(11) | NO | | | |
+------------------+------------------+------+-----+---------+----------------+
The only ones we are actively using are as follows:
spp_parse_id | id for this row, linked by sp_result |
spp_i_id | pointer to item used as input to the parse |
spp_osp_id | id of original source profile this information came from |
sp_result
Mirrors the result file in the source profile with the addition of the spr_osp_id which is a foreign key to orig_source_profile.osp_orig_src_prof_id. sp_result stands for `source profile: result’ and corresponds to the [incr tsdb()] result file. The result file contains information about the result of processing of each item in a given test run, including the derivation and the output MRS. Again, there are many columns we are not using (which are in there initially to mirror the [incr tsdb()] tables). The ones we are using are glossed below:
+----------------+------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+----------------+------------------+------+-----+---------+----------------+
| spr_parse_id | int(11) | NO | MUL | | |
| spr_result_id | int(10) unsigned | NO | PRI | NULL | auto_increment |
| spr_time | int(11) | YES | | -1 | |
| spr_ctasks | int(11) | YES | | -1 | |
| spr_ftasks | int(11) | YES | | -1 | |
| spr_etasks | int(11) | YES | | -1 | |
| spr_stasks | int(11) | YES | | -1 | |
| spr_size | int(11) | YES | | -1 | |
| spr_aedges | int(11) | YES | | -1 | |
| spr_pedges | int(11) | YES | | -1 | |
| spr_derivation | varchar(1000) | YES | | | |
| spr_surface | varchar(1000) | YES | | | |
| spr_tree | varchar(1000) | YES | | | |
| spr_mrs | varchar(1000) | YES | | | |
| spr_flags | varchar(24) | YES | | | |
| spr_wf | tinyint(4) | YES | | 1 | |
| spr_osp_id | int(11) | NO | | | |
+----------------+------------------+------+-----+---------+----------------+
spr_parse_id | pointer to row in sp_parse table and through it, row in sp_item |
spr_result_id | id for this result |
spr_mrs | tag corresponding to actual mrs value to be found in the mrs table; itsdb would have an actual mrs here |
spr_wf | ? |
spr_osp_id | id of original source profile this information came from |
Tables for storing master item list
After we import items, we then run add_permutes to make a master list of derived items on the basis of the imported ones. We store these derived items in the tables item_tsdb, parse, result, which are parallel to sp_item, sp_parse, and sp_result.
Note: add_permutes only adds those items to item_tsdb/parse/result that pass all universal filters.
seed_str
Stores the seed strings which are the harvester strings after having gone through stringmods stored in a canonical form: words, prefixes, suffixes, with all three of those being in alphabetical order.
+----------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+----------------+--------------+------+-----+---------+----------------+
| seed_id | int(11) | NO | PRI | NULL | auto_increment |
| seed_str_value | varchar(100) | YES | | NULL | |
+----------------+--------------+------+-----+---------+----------------+
seed_id | Primary key |
seed_str_value | Seed string in normalized form |
str_lst
Intersection table that maps seed strings to mrs tags.
+--------------+------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------------+------------------+------+-----+---------+----------------+
| sl_id | int(11) | NO | PRI | NULL | auto_increment |
| sl_mrs_tag | char(20) | YES | | NULL | |
| sl_seed_type | char(2) | YES | | NULL | |
| sl_seed_id | int(10) unsigned | YES | | NULL | |
+--------------+------------------+------+-----+---------+----------------+
sl_id | Primary key |
sl_mrs_tag | The semantic mrs tag this seed string belongs to. Can serve as a foreign key to mrs.mrs_tag and result.r_mrs. |
sl_seed_type | intended to indicate seed/harv status; currently NULL |
sl_seed_id | Foreign key to seed_str.seed_id |
item
Mirrors the item file in a [incr_tsdb()] profile with the addition of i_osp_id which is a foreign key to orig_source_profile.osp_orig_src_prof_id.
+--------------+------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------------+------------------+------+-----+---------+----------------+
| i_id | int(10) unsigned | NO | PRI | NULL | auto_increment |
| i_origin | char(20) | YES | | | |
| i_register | char(20) | YES | | | |
| i_format | char(20) | YES | | | |
| i_difficulty | int(4) | YES | | -1 | |
| i_category | char(20) | YES | | | |
| i_input | varchar(512) | YES | MUL | | |
| i_wf | int(4) | YES | | 1 | |
| i_length | int(4) | YES | | -1 | |
| i_comment | varchar(256) | YES | | | |
| i_author | char(50) | YES | | | |
| i_date | char(10) | YES | | | |
| i_osp_id | int(11) | NO | | | |
+--------------+------------------+------+-----+---------+----------------+
i_id | Primary key |
i_input | string, as created by add_permutes.py |
i_length | length of string in words |
i_author | records add_permutes.py as the author of the item |
i_osp_id | original source profile that the harvester string the item is based on comes from |
parse
Mirrors the item file in a [incr_tsdb()] profile with the addition of p_osp_id which is a foreign key to orig_source_profile.osp_orig_src_prof_id.
+----------------+------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+----------------+------------------+------+-----+---------+----------------+
| p_parse_id | int(10) unsigned | NO | PRI | NULL | auto_increment |
| p_run_id | int(11) | YES | | -1 | |
| p_i_id | int(11) | NO | MUL | | |
| p_readings | int(11) | YES | | -1 | |
| p_first | int(11) | YES | | -1 | |
| p_total | int(11) | YES | | -1 | |
| p_tcpu | int(11) | YES | | -1 | |
| p_tgc | int(11) | YES | | -1 | |
| p_treal | int(11) | YES | | -1 | |
| p_words | int(11) | YES | | -1 | |
| p_l_stasks | int(11) | YES | | -1 | |
| p_ctasks | int(11) | YES | | -1 | |
| p_ftasks | int(11) | YES | | -1 | |
| p_etasks | int(11) | YES | | -1 | |
| p_stasks | int(11) | YES | | -1 | |
| p_aedges | int(11) | YES | | -1 | |
| p_pedges | int(11) | YES | | -1 | |
| p_raedges | int(11) | YES | | -1 | |
| p_rpedges | int(11) | YES | | -1 | |
| p_unifications | int(11) | YES | | -1 | |
| p_copies | int(11) | YES | | -1 | |
| p_conses | int(11) | YES | | -1 | |
| p_symbols | int(11) | YES | | -1 | |
| p_others | int(11) | YES | | -1 | |
| p_gcs | int(11) | YES | | -1 | |
| p_i_load | int(11) | YES | | -1 | |
| p_a_load | int(11) | YES | | -1 | |
| p_date | char(22) | YES | | | |
| p_error | varchar(1000) | YES | | | |
| p_comment | varchar(1000) | YES | | | |
| p_osp_id | int(11) | NO | | | |
+----------------+------------------+------+-----+---------+----------------+
p_parse_id | Primary key |
p_i_id | id of the input item this parse corresponds to. As of now they are always the same. |
p_osp_id | original source profile that the harvester string the item is based on comes from |
result
Mirrors the result file in the source profile with the addition of the r_osp_id which is a foreign key to orig_source_profile.osp_orig_src_prof_id and r_esp_id which is NULL
+--------------+------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------------+------------------+------+-----+---------+----------------+
| r_parse_id | int(11) | NO | MUL | | |
| r_result_id | int(10) unsigned | NO | PRI | NULL | auto_increment |
| r_time | int(11) | YES | | -1 | |
| r_ctasks | int(11) | YES | | -1 | |
| r_ftasks | int(11) | YES | | -1 | |
| r_etasks | int(11) | YES | | -1 | |
| r_stasks | int(11) | YES | | -1 | |
| r_size | int(11) | YES | | -1 | |
| r_aedges | int(11) | YES | | -1 | |
| r_pedges | int(11) | YES | | -1 | |
| r_derivation | varchar(1000) | YES | | | |
| r_surface | varchar(1000) | YES | | | |
| r_tree | varchar(1000) | YES | | | |
| r_mrs | varchar(1000) | YES | MUL | | |
| r_flags | varchar(24) | YES | | | |
| r_wf | tinyint(4) | YES | | 1 | |
| r_osp_id | int(11) | NO | | | |
| r_esp_id | int(11) | YES | | NULL | |
+--------------+------------------+------+-----+---------+----------------+
r_parse_id | parse row this result corresponds to. As of now, the IDs are always the same. |
r_result_id | id for this result. As of now, the result, parse, and item IDs are always the same |
r_mrs | mrs tag corresponding to mrs_value in mrs |
r_wf | 1 if this result passes all universal filters, 0 if it fails at least one; filled in by stored procedure in MatrixTDB |
r_osp_id | original source profile that the harvester item the result is based on comes from |
r_esp_id | NULL. Originally intended to be most recent profile this has been exported to? |
Tables for language types and filters
These tables are used by sql_lg_type (to import language types), run_specific_filters (to run the master items through specific filters and record their pass/fail results), and generate_s_profile (to generate a validation [incr_tsdb()] profile).
feat_grp
This table stores groups of features referred to elsewhere. There are singleton groups (containing just one feature) and larger groups. The largest groups (whole language types) aren’t stored here, but rather in lt_feat_grp (see below). The non-singleton groups which are stored here are the conjunctions referred to by filters.
+-----------+----------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-----------+----------+------+-----+---------+----------------+
| fg_id | int(11) | NO | PRI | NULL | auto_increment |
| fg_grp_id | int(11) | YES | MUL | NULL | |
| fg_feat | char(20) | YES | MUL | NULL | |
| fg_value | char(20) | YES | | NULL | |
+-----------+----------+------+-----+---------+----------------+
fg_id | Primary key |
fg_grp_id | the ID of the feature group |
fg_feat | feature name |
fg_value | feature value |
filter
Filter repository whose purpose is to match the names of the filters in s_filters.py to the IDs used as foreign keys in fltr_feat_grp.
+-------------+-----------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+-----------+------+-----+---------+----------------+
| filter_id | int(11) | NO | PRI | NULL | auto_increment |
| filter_name | char(100) | YES | | NULL | |
| filter_type | char(2) | YES | | s | |
+-------------+-----------+------+-----+---------+----------------+
filter_id | Primary key |
filter_name | The name member of the filter in s_filters.py |
filter_type | Either ‘u’ for universal or ‘s’ for specific. Now we’re just using ‘s’ since add_permutes only enters those items that pass all universal filters and we’re skipping run_u_filters.py which used this table for ‘u’ filters. |
fltr_feat_grp
Intersection table that stores the feature groups that each specific filter cares about. These feature groups are the specification of the language types that the filter applies to. Used in combination with lt_feat_grp to get the filters thatapply to a language type
+-------------+---------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+---------+------+-----+---------+----------------+
| ffg_id | int(11) | NO | PRI | NULL | auto_increment |
| ffg_fltr_id | int(11) | YES | MUL | NULL | |
| ffg_grp_id | int(11) | YES | MUL | NULL | |
+-------------+---------+------+-----+---------+----------------+
ffg_id | Primary key |
ffg_fltr_id | A foreign key to filter.filter_id |
ffg_grp_id | A foreign key to feat_grp.fg_grp_id |
fltr_mrs
NOTE: Not used in MatrixTDB2.
Each filter has some set of semantic equivalence classes that it applies to. This table seems to have been intended to record that information. Note that there can be multiple rows with for any given filter (i.e., with the same fm_fltr_id).
+------------+----------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+------------+----------+------+-----+---------+----------------+
| fm_id | int(11) | NO | PRI | NULL | auto_increment |
| fm_fltr_id | int(11) | YES | | NULL | |
| fm_mrs_tag | char(20) | YES | | NULL | |
| fm_value | int(11) | YES | | NULL | |
+------------+----------+------+-----+---------+----------------+
fm_id | row id |
fm_fltr_id | filter id, points to filter |
fm_mrs_tag | mrs tag, indicating equivalence class |
fm_value | whether or not the filter applies? |
NOTE: There is no data in this table right now. The filters are defined as filter objects in the python code, and they each contain as list of mrs tags they apply to. This information could be stored in the DB, for reference, I supposed, but it isn’t currently.
lt
This table stores language types that we have worked with.
+------------+---------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+------------+---------------+------+-----+---------+----------------+
| lt_id | int(11) | NO | PRI | NULL | auto_increment |
| lt_origin | char(2) | YES | | NULL | |
| lt_comment | varchar(1000) | YES | | NULL | |
+------------+---------------+------+-----+---------+----------------+
lt_id | Primary key |
lt_origin | Either ‘r’ or ‘p’. Indicates whether the language type was purpose-built or randomly-generated |
lt_comment | A user-entered comment about what the language type is |
lt_feat_grp
Intersection table that matches language types up with the feature groups they have. This table stores pointers from language types to the feature groups they comprise. On the one hand, it points to all the singleton group (feature-value pairs) that define the lt. On the other hand, it also points to the larger groups that are consistent with the lt definition. We update this table through create_or_update_lt() in sql_lg_types.py and update_all_lts_in_lfg() in run_specific_filters.py. We use it to find the filters that are appropriate for a given lt. Used in combination with fltr_feat_grp to find filters that are relevant to a language type.
+------------+---------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+------------+---------+------+-----+---------+----------------+
| lfg_id | int(11) | NO | PRI | NULL | auto_increment |
| lfg_lt_id | int(11) | YES | MUL | NULL | |
| lfg_grp_id | int(11) | YES | MUL | NULL | |
+------------+---------+------+-----+---------+----------------+
lfg_id | Primary key |
lfg_lt_id | Foreign key to lt.lt_id |
lfg_grp_id | A foreign key to feat_grp.fg_grp_id |
One final note on this set of tables: If you are debugging the system and trying to figure out why a given string was not grammatical, you will want to know what specific filters it failed. The way to do this is to run the query that generate_s_profile runs when determine which strings fail filters specific to the relevant language type. This query, in a slightly different from from that used by genereate_s_profile, is:
- SELECT filter_name FROM filter
- INNER JOIN res_sfltr ON filter_id = rsf_sfltr_id
INNER JOIN fltr_feat_grp ON rsf_sfltr_id = ffg_fltr_id
- INNER JOIN lt_feat_grp ON ffg_grp_id = lfg_grp_id
WHERE rsf_value = 0
- AND lfg_lt_id = [lt_id] AND rsf_res_id = [result_id]
where lt_id is the lt_id of the language type you are generating the profile for and result_id is the id of the result you are wondering about. Be sure you find the ID that matches item_tsdb.i_input (the string) and r.r_mrs (the semantic tag) you’re wondering about. This will get you a list of all specific filters that the item failed. If the item didn’t fail any specific filters and still isn’t appearing as a grammatical item in your profile it probably failed a universal filter (in which case you won’t even have a result id for it.)
Tables for running filters
These tables are used by run_u_filters (no longer used) and run_specific_filters to record the results or running filters on string/mrs combos.
res_fltr
NOTE: No longer used since add_permutes just puts in those items that pass all universal filters.
This table records the results of each universal filter on each item (row from result, actually) that it applies to. Now rows are created in this table for filter/result combinations where the filter does not apply to that result.
Might get renamed to res_ufltr for clarity.
+------------+---------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+------------+---------+------+-----+---------+----------------+
| rf_id | int(11) | NO | PRI | NULL | auto_increment |
| rf_res_id | int(11) | YES | MUL | NULL | |
| rf_fltr_id | int(11) | YES | | NULL | |
| rf_value | int(4) | YES | | NULL | |
+------------+---------+------+-----+---------+----------------+
rf_id | Primary key |
rf_res_id | Foreign key to result.r_result_id, parse.p_parse_id, and item_tsdb.i_id |
rf_fltr_id | Foreign key to filter.filter_id |
rf_value | The ‘result’ of applying the filter to the ‘result’. 0 for fail, 1 for pass, 2 for does-not-apply |
res_sfltr
Intersection table that stores the ‘result’ of applying a specific filter to the ‘result’ in a [incr_tsdb()] profile. Currently only used to store fails, not passes or does-not-applys.
+--------------+---------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------------+---------+------+-----+---------+----------------+
| rsf_id | int(11) | NO | PRI | NULL | auto_increment |
| rsf_res_id | int(11) | YES | MUL | NULL | |
| rsf_sfltr_id | int(11) | YES | MUL | NULL | |
| rsf_value | int(4) | YES | | NULL | |
+--------------+---------+------+-----+---------+----------------+
rsf_id | row id |
rsf_res_id | Foreign key to result.r_result_id, parse.p_parse_id, and item_tsdb.i_id |
rsf_sfltr_id | Foreign key to filter.filter_id |
rsf_value | The ‘result’ of applying the filter to the ‘result’. 0 for fail, 1 for pass, 2 for does-not-apply |
Ancillary tables
Last update: 2009-09-17 by EmilyBender [edit]