What is a datamart?
A datamart is a place where you can go online to obtain a variety of health-related data,
stratified (broken down) by demographic factors such as age, race, gender,
and/or geographic area. These data can be used for a variety of purposes, such as research,
community-based health assessments, grant proposals, community benefit plans,
data-based decision making, and program evaluation.
What is the source of the data?
The Birth and Death Data are from the Vital Statistics Division of the Marion County
Health Department, Indianapolis, Indiana. All Marion County resident births and deaths
are included in these datasets. The data are provided for both zipcode and census tract
geographic areas. Out-of-county residents who deliver or die in one of Marion County's
hospitals are not included in these datasets.
The 1999 Death data are based on the new
International Classification of Diseases, 10th Revision (ICD-10) codes. Previous years used the
ICD-9 codes. There are major differences between these two coding schemes. We have
attempted to group the 1999 death data in such a manner that comparisons can be made
with earlier years of data. Currently, there is no national guidance as to how ICD-10
codes should be grouped to make them completely compatible with ICD-9 codes. Therefore,
the manner in which our 1999 death data are grouped is subject to change once guidance is
provided by the National Center for Health Statistics. If you would like to see our disease
groupings, please contact us at datamart@hhcorp.org and we will be happy to provide you with a copy.
The Cancer Incidence data are from the Indiana State Cancer Registry. Indiana law
requires all physicians, hospitals, and laboratories to report to the Indiana State
Department of Health all newly identified and confirmed cases of cancer. This dataset
includes all newly identified cancer cases among Marion County residents.
The Indiana Health and Hospital Association and the Indiana State Department of Health
provide the Hospital Discharge Data and Inpatient Hospital Procedures Data. Data
on all hospital admissions of Marion County residents that resulted in a hospital stay of
at least 24 hours are included. The primary diagnosis is used to stratify the data. The
primary procedures performed are included in the Hospital Inpatient Procedures section.
The Communicable Disease data are from the case reports from the Communicable
Disease Control Program, the HIV/AIDS Program, the Immunization Program, the STD
Control Program, and the Tuberculosis Control Program of the Marion County Health
Department. These data are reported on a calendar-year basis, not by CDC reporting
weeks.
The Indiana State Department of Health provides the Behavioral Risk Factor
Surveillance System Data (BRFSS). The BRFSS program randomly contacts
approximately 2,400 Indiana households and asks them questions regarding health
activities and risk behaviors. The number of individuals called in a specific geographic
area is based on the size of the underlying population. Since approximately 20 percent of
the population of Indiana resides in Marion County, enough Marion County
residents are called to permit stratification and appropriate weighting of the data for
Marion County residents alone. The BRFSS data are grouped by major risk factors. After
you select the risk factor you are interested in, the actual questions asked of study participants
will appear. Select the study question on which you would like information, then continue on to
choose how your table of information will be broken down, such as by age, race, or gender.
How do I use the Datamart?
The datamart is easy to use. First, identify the dataset you would like to query. Next,
select the key row and column factors you would like in your data table. Then select the
year or years of data you would like to access, and the geographic area on which
you want information. Next, indicate whether you want frequencies only, frequencies
with percentages (row or column), or
age-adjusted rates (available in some datasets). Finally, identify the risk factor or
factors, or the cause of death, that you would like to study. You can select more than one.
Please remember that the data are provided in several levels, from the most general
to the more specific. Whenever a disease group appears in blue, additional
information can be obtained by drilling down.
How many cases have to be present to release the confidentiality control?
The confidentiality block is an attempt to help stabilize the data and prevent spurious
associations caused by great statistical variability in the underlying rates of disease.
This statistical variability is due to small numbers of cases. Generally, it takes at
least 3 cases to release the confidentiality block. You can increase the number of cases for
each outcome variable by combining multiple years of data, multiple zipcodes, and/or multiple
census tracts.
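As a rough illustration of why combining years helps, the sketch below blocks any cell with fewer than 3 cases and then repeats the check on pooled years. The outcome names and counts are hypothetical, and the exact rule the datamart applies (for example, how zero counts are handled) is an assumption here.

```python
from collections import Counter

THRESHOLD = 3  # "at least 3 cases", as stated above


def combine_years(counts_by_year):
    """Sum case counts across years for each outcome."""
    total = Counter()
    for counts in counts_by_year.values():
        total.update(counts)
    return dict(total)


def apply_block(counts, threshold=THRESHOLD):
    """Replace any count below the threshold with None (blocked).

    Assumption: every cell under the threshold is blocked.
    """
    return {name: (n if n >= threshold else None) for name, n in counts.items()}


# Hypothetical counts for a single zipcode.
counts_by_year = {
    1997: {"asthma": 1, "influenza": 4},
    1998: {"asthma": 2, "influenza": 5},
}

single_year = apply_block(counts_by_year[1997])      # asthma is blocked
pooled = apply_block(combine_years(counts_by_year))  # asthma reaches 3 cases
```

In this made-up case the asthma cell is blocked for 1997 alone but is released once 1997 and 1998 are pooled.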
How do I download the data?
Clicking the download button will download the data. You can then rename the file
as you like and change its file extension. The system
automatically assigns a "cgi" extension; however, we recommend making the file a
"txt" file. To do this, type the file name you choose followed by .txt. A file with a
"txt" extension can easily be imported into another program such as Word, Excel,
Access, SAS, or any other software program you may be using.
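If you download files often, the renaming step can also be scripted. The following is a minimal sketch in Python; the function name is hypothetical, and it makes no assumption about the contents or layout of the downloaded file.

```python
from pathlib import Path


def rename_to_txt(downloaded: Path) -> Path:
    """Rename a downloaded file so that it ends in .txt instead of .cgi."""
    target = downloaded.with_suffix(".txt")
    if target.exists():
        # Refuse to overwrite an earlier download with the same name.
        raise FileExistsError(f"{target} already exists")
    downloaded.rename(target)
    return target
```

For example, `rename_to_txt(Path("mydata.cgi"))` would leave a file named `mydata.txt` in the same folder.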
How do I obtain data on multiple zipcodes or census tracts?
If you want data on sequential zipcodes or census tracts, click on the first census tract
or zipcode, hold down the left mouse button, and drag the cursor to the last zipcode or census
tract on which you want information. If the zipcodes or
census tracts are non-sequential, highlight the first one, hold down the Control key, and highlight
the remaining census tracts or zipcodes on which you want information.
Highlighting the census tract or zipcode field in the “row” fields will give you
information on each census tract or zipcode. Only 150 individual census tracts may be selected
at one time. If you do not select the row parameter “Zipcode” or “Census tract”, the program
will automatically combine all of the data for the census tracts or zipcodes selected. A listing of
the tracts or zipcodes will be provided below the data table.
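If you need individual results for more than 150 census tracts, one possible approach is to split your list into groups of at most 150 and run one query per group. The sketch below only shows the splitting; the helper name and the tract list are hypothetical.

```python
def batches(tracts, size=150):
    """Split a list of tracts into consecutive groups of at most `size`."""
    return [tracts[i:i + size] for i in range(0, len(tracts), size)]


# Hypothetical list of 320 tract identifiers.
example_tracts = [f"tract-{n}" for n in range(320)]
groups = batches(example_tracts)
group_sizes = [len(g) for g in groups]  # 150, 150, and the remaining 20
```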
How do I obtain data on Townships?
To aggregate the data into Townships, you must use the databases aggregated by census
tract. For
Pike Township use census tracts 3101.03 through 3103.06,
Washington Township use census tracts 3201.05 through 3209.03,
Lawrence Township use census tracts 3301.03 through 3310,
Wayne Township use census tracts 3401.01 through 3426.00,
Center Township use census tracts 3501 through 3581,
Warren Township use census tracts 3601.01 through 3616,
Decatur Township use census tracts 3701 through 3703,
Perry Township use census tracts 3801 through 3812.05,
Franklin Township use census tracts 3901 through 3904.
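If you are working with downloaded tract-level data, the ranges listed above can be written as a small lookup table. This is a sketch only: it assumes each range is inclusive and compares tract numbers numerically, so tracts falling outside every listed range return no township.

```python
from decimal import Decimal

# Census tract ranges copied from the list above (low, high), inclusive.
TOWNSHIP_RANGES = {
    "Pike": ("3101.03", "3103.06"),
    "Washington": ("3201.05", "3209.03"),
    "Lawrence": ("3301.03", "3310"),
    "Wayne": ("3401.01", "3426.00"),
    "Center": ("3501", "3581"),
    "Warren": ("3601.01", "3616"),
    "Decatur": ("3701", "3703"),
    "Perry": ("3801", "3812.05"),
    "Franklin": ("3901", "3904"),
}


def township_for_tract(tract):
    """Return the township whose range contains `tract`, or None."""
    value = Decimal(str(tract))
    for name, (low, high) in TOWNSHIP_RANGES.items():
        if Decimal(low) <= value <= Decimal(high):
            return name
    return None
```

For example, `township_for_tract("3550")` returns `"Center"`, and a tract such as `"3000"` that falls in no range returns `None`.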
What is the difference between a frequency, percentage, and a rate?
A frequency is an actual count of the number of cases of a particular disease. A
percentage and a rate are both fractions in which the numerator is included in the denominator.
For example, the percentage of cancer cases occurring in males is the number of males with cancer
divided by the total number of persons with cancer, multiplied by 100. In Vital Statistics, rates
are usually expressed per some convenient base, such as per 1,000 or per 100,000 persons.
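The three measures can be worked through with a few made-up numbers. The counts and population below are hypothetical and are not taken from the datamart.

```python
def percentage(part, whole):
    """Percentage of `whole` accounted for by `part`."""
    return 100 * part / whole


def rate(cases, population, base=100_000):
    """Cases per `base` persons in the population."""
    return cases * base / population


# Hypothetical figures: 100 cancer cases, 40 of them in males,
# occurring in a population of 25,000 persons.
frequency = 100                          # a simple count of cases
male_share = percentage(40, frequency)   # percentage of cases occurring in males
cases_per_1000 = rate(frequency, 25_000, base=1_000)
```

Here the frequency is 100 cases, 40 percent of the cases occur in males, and the rate is 4 cases per 1,000 persons.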
What is age adjustment?
Many diseases are related to age. For example, individuals at certain ages may be
more prone to develop cancer than others. Different populations or groups of individuals
may vary by age, with some population groups being older and others being younger. This makes it
very difficult to ascertain whether differences in rates of disease are due to differences
in age or to differences in the actual disease experience. Age adjustment is a way to statistically
control for the differing ages of persons in populations, thus making comparisons between
populations possible. To do this, we apply the age-specific disease rates for each population
to a standard population, which standardizes any effect due to age. The rates of disease in
the two populations can then be compared without having to worry about any effect due to age.
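The calculation described above is commonly done by direct standardization: each age-specific rate is weighted by that age group's share of the standard population. The sketch below assumes that method; the age groups, counts, and standard population figures are hypothetical and are not the standard population used by the datamart.

```python
def crude_rate(cases, population, base=100_000):
    """Overall rate, ignoring age structure."""
    return sum(cases.values()) * base / sum(population.values())


def age_adjusted_rate(cases, population, standard, base=100_000):
    """Directly standardized rate per `base` persons.

    Each age-specific rate is weighted by the share of the standard
    population that falls in that age group.
    """
    total_standard = sum(standard.values())
    adjusted = 0.0
    for group, standard_count in standard.items():
        age_specific = cases[group] / population[group]
        adjusted += age_specific * (standard_count / total_standard)
    return adjusted * base


# Hypothetical two-group example.
cases = {"under 65": 10, "65 and over": 50}
population = {"under 65": 10_000, "65 and over": 5_000}
standard = {"under 65": 700, "65 and over": 300}

crude = crude_rate(cases, population)                      # 400 per 100,000
adjusted = age_adjusted_rate(cases, population, standard)  # about 370 per 100,000
```

In this example the population is older than the standard, so the age-adjusted rate comes out lower than the crude rate.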
What year of data is used to age adjust the data?
Currently, we are using the 1940 standard population to age adjust our data. However, we
will be switching to the new 2000 standard population in the near future.
Will I get data on each disease by specifying "All Diseases" in the outcome section
and selecting "Cause" in the row field?
No, you will only get a summation of all diseases. To get a listing of each disease, specify
"Cause" from the row key field list and highlight each specific disease that you are
interested in. Remember that for many diseases there are multiple levels of data.
What if I need data that are not in the datamart or have a special request?
If you need additional data or have a special request, please contact us at the e-mail
address provided at the bottom of the screen.
Who do I contact if I have any questions?
Please feel free to drop us a line using the e-mail address provided at the bottom of the
screen detailing any problems you may have had, any questions, or
your suggestions for improving our datamart. We are always interested in your
suggestions and we will respond as soon as possible.