Frequently Asked Questions

 
What is a datamart?

A datamart is an online resource where you can obtain a variety of health-related data, stratified (broken down) by demographic factors such as age, race, gender, and geographic area. These data can be used for purposes such as research, community-based health assessments, grant proposals, community benefit plans, data-based decision making, and program evaluation.


What is the source of the data?

The Birth and Death data are from the Vital Statistics Division of the Marion County Health Department, Indianapolis, Indiana. All Marion County resident births and deaths are included in these datasets. The data are provided for both zipcode and census tract geographic areas. Out-of-county residents who deliver or die in one of Marion County's hospitals are not included. The 1999 Death data are based on the new International Classification of Diseases, 10th Revision (ICD-10) codes; previous years used ICD-9 codes. There are major differences between these two coding schemes. We have attempted to group the 1999 death data so that comparisons can be made with earlier years. Currently, there is no national guidance on how ICD-10 codes should be grouped to make them completely compatible with ICD-9 codes. The way our 1999 death data are grouped is therefore subject to change once the National Center for Health Statistics provides guidance. If you would like to see our disease groupings, please contact us at datamart@hhcorp.org and we will be happy to provide you with a copy.

The Cancer Incidence data are from the Indiana State Cancer Registry. Indiana law requires all physicians, hospitals, and laboratories to report all newly identified and confirmed cases of cancer to the Indiana State Department of Health. This dataset includes all newly identified cancer cases in Marion County residents.

The Indiana Health and Hospital Association and the Indiana State Department of Health provide the Hospital Discharge data and the Hospital Inpatient Procedures data. All hospital admissions of Marion County residents that resulted in a hospital stay of at least 24 hours are included. The primary diagnosis is used to stratify the discharge data. The primary procedures performed are included in the Hospital Inpatient Procedures section.

The Communicable Disease data come from case reports of the Communicable Disease Control Program, the HIV/AIDS Program, the Immunization Program, the STD Control Program, and the Tuberculosis Control Program of the Marion County Health Department. These data are reported by calendar year rather than by CDC reporting weeks.

The Indiana State Department of Health provides the Behavioral Risk Factor Surveillance System (BRFSS) data. The BRFSS program randomly contacts approximately 2,400 Indiana households and asks them questions about health activities and risk behaviors. The number of individuals called in a specific geographic area is based on the size of the underlying population. Since approximately 20 percent of the population of Indiana resides in Marion County, enough Marion County residents are called to permit stratification and appropriate weighting of the data for Marion County residents alone. The BRFSS data are grouped by major risk factors. After you select the risk factor you are interested in, the actual questions asked of study participants will appear. Select the study question on which you would like information, then choose how your table should be broken down, such as by age, race, or gender.

How do I use the Datamart?

The datamart is very easy to use. First, identify the dataset you would like to query. Next, select the key row and column factors you would like in your data table. Then select the year or years of data you would like to access and the geographic area on which you want information. Next, indicate whether you want frequencies only, frequencies with row or column percentages, or age-adjusted rates (available in some datasets). Finally, identify the risk factor or factors, or the cause of death, that you would like to study. You can select more than one. Please remember that the data are provided in several levels, beginning with the most general and going to more specific data. Any time a disease group appears in blue, additional information may be obtained by drilling down.

How many cases have to be present to release the confidentiality control?

The confidentiality block is an attempt to help stabilize the data and prevent spurious associations due to great statistical variability in the underlying rates of disease. The statistical variability is due to small numbers of cases. Generally it takes at least 3 cases to release the confidentiality block. You can increase the number of cases for each outcome variable by combining multiple years of data, multiple zipcodes, and/or multiple census tracts.
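
The effect of combining years can be illustrated with a short Python sketch. This is not the datamart's own code: the threshold of 3 is taken from the "at least 3 cases" wording above, and the function names, the "suppressed" marker, and the yearly counts are hypothetical.

```python
# Assumed from "at least 3 cases"; the datamart's exact rule may differ.
SUPPRESSION_THRESHOLD = 3


def display_count(count, threshold=SUPPRESSION_THRESHOLD):
    """Return the count if it meets the threshold, otherwise a suppression marker."""
    return count if count >= threshold else "suppressed"


def combine_counts(counts_by_year):
    """Add up case counts across years to obtain a larger total."""
    return sum(counts_by_year.values())


# Hypothetical counts for one cause of death in one zipcode.
counts_by_year = {1997: 1, 1998: 2, 1999: 1}

# Each single year falls below the threshold and would be blocked.
single_years = {year: display_count(n) for year, n in counts_by_year.items()}

# The three years combined reach the threshold and could be shown.
combined = display_count(combine_counts(counts_by_year))
```

In this made-up example every single-year cell is blocked, while the combined three-year total of 4 cases is large enough to be released.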

How do I download the data?

Clicking on the download button will download the data. You can then rename the file and change its extension. The system automatically assigns a "cgi" extension; however, we recommend making the file a "txt" file. To do this, type the file name you choose followed by .txt. A file with a "txt" extension can easily be imported into another program such as Word, Excel, Access, SAS, or any other software you may be using.
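
If you have already saved a download under its "cgi" extension, the renaming step can also be scripted. The following is a minimal Python sketch using only the standard library; the function name and the example file name are hypothetical, and it assumes no file with the new name already exists.

```python
from pathlib import Path


def rename_to_txt(downloaded):
    """Rename a downloaded file so that it carries a .txt extension."""
    source = Path(downloaded)
    target = source.with_suffix(".txt")  # e.g. query.cgi -> query.txt
    source.rename(target)
    return target
```

For example, rename_to_txt("query.cgi") would rename a saved file called query.cgi to query.txt in the same folder.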

How do I obtain data on multiple zipcodes or census tracts?

If you want data on sequential zipcodes or census tracts, click on the first zipcode or census tract, hold the left mouse button, and drag the cursor to the last zipcode or census tract on which you want information. If the zipcodes or census tracts are non-sequential, highlight the first one, hold down the Control key, and highlight the remaining zipcodes or census tracts on which you want information. Highlighting the census tract or zipcode field in the “row” fields will provide information on each individual census tract or zipcode. Only 150 individual census tracts may be selected at one time. If you do not select the row parameter “Zipcode” or “Census tract”, the program will automatically combine all of the data for the census tracts or zipcodes selected. A listing of the tracts or zipcodes will be provided below the data table.

How do I obtain data on Townships?

To aggregate the data into townships, you must use the databases aggregated by census tract. Use the following census tracts for each township: Pike Township, 3101.03 through 3103.06; Washington Township, 3201.05 through 3209.03; Lawrence Township, 3301.03 through 3310; Wayne Township, 3401.01 through 3426.00; Center Township, 3501 through 3581; Warren Township, 3601.01 through 3616; Decatur Township, 3701 through 3703; Perry Township, 3801 through 3812.05; Franklin Township, 3901 through 3904.
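
If you work with downloaded tract-level counts, the township ranges above can be written as a lookup table. The Python sketch below is illustrative only: the function names and the example counts are hypothetical, and it assumes each range includes both endpoints and that a whole-number upper bound such as 3310 does not cover suffixed tracts above it.

```python
# Census tract ranges per township, as listed above (both ends inclusive).
TOWNSHIP_TRACT_RANGES = {
    "Pike": (3101.03, 3103.06),
    "Washington": (3201.05, 3209.03),
    "Lawrence": (3301.03, 3310.00),
    "Wayne": (3401.01, 3426.00),
    "Center": (3501.00, 3581.00),
    "Warren": (3601.01, 3616.00),
    "Decatur": (3701.00, 3703.00),
    "Perry": (3801.00, 3812.05),
    "Franklin": (3901.00, 3904.00),
}


def township_for_tract(tract):
    """Return the township whose range contains the tract, or None."""
    code = float(tract)
    for township, (low, high) in TOWNSHIP_TRACT_RANGES.items():
        if low <= code <= high:
            return township
    return None


def aggregate_by_township(counts_by_tract):
    """Sum tract-level counts into township totals."""
    totals = {}
    for tract, count in counts_by_tract.items():
        township = township_for_tract(tract)
        if township is not None:
            totals[township] = totals.get(township, 0) + count
    return totals


# Hypothetical tract-level counts, for illustration only.
example_counts = {"3101.03": 5, "3102.01": 7, "3501": 4, "3904": 2}
township_totals = aggregate_by_township(example_counts)
```

With these made-up counts, the two Pike Township tracts are added together and the Center and Franklin tracts are reported under their own townships.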

What is the difference between a frequency, percentage, and a rate?

A frequency is the actual count of cases of a particular disease. Percentages and rates are fractions in which the numerator is included in the denominator. For example, the percentage of cancer cases that occur in males is the number of males with cancer divided by the total number of persons with cancer, multiplied by 100. In vital statistics, rates are usually expressed per some convenient base, such as per 1,000 or per 100,000 persons.
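
The three measures can be compared with a small worked example in Python. All numbers below are invented for illustration, and the function names are hypothetical.

```python
def percentage(part, whole):
    """Share of the whole made up by the part, expressed per 100."""
    return 100 * part / whole


def rate(cases, population, base=100_000):
    """Cases per `base` persons in the population."""
    return cases * base / population


# Hypothetical figures for one area and one year.
male_cases = 120      # frequency: cancer cases among males
total_cases = 300     # frequency: cancer cases among all persons
population = 800_000  # total resident population

pct_male = percentage(male_cases, total_cases)  # percentage of cases in males
cancer_rate = rate(total_cases, population)     # cases per 100,000 persons
```

Here the frequency is 300 cases, 40 percent of those cases occur in males, and the rate is 37.5 cases per 100,000 persons.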

What is age adjustment?

Many diseases are related to age. For example, individuals at certain ages may be more prone to develop cancer than others. Populations also differ in their age makeup, with some groups being older and others younger. This makes it very difficult to tell whether differences in rates of disease are due to differences in age or to differences in the actual disease experience. Age adjustment is a way to statistically control for the differing ages of persons in populations, making comparisons between populations possible. To do this, we compare the age-specific disease rates for each population to a standard population, which standardizes any effect due to age. The rates of disease in the two populations can then be compared without having to worry about any effect due to age.
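
The idea can be sketched in Python, assuming the direct method of standardization, in which each age-specific rate is weighted by that age group's share of the standard population. The age groups, case counts, populations, and standard population below are all invented for illustration; they are not the datamart's data or its actual standard population.

```python
def crude_rate(cases_by_age, population_by_age, base=100_000):
    """Overall rate per `base` persons, ignoring age structure."""
    return sum(cases_by_age.values()) * base / sum(population_by_age.values())


def age_adjusted_rate(cases_by_age, population_by_age, standard_population,
                      base=100_000):
    """Directly standardized rate: age-specific rates weighted by the
    standard population's age distribution."""
    standard_total = sum(standard_population.values())
    adjusted = 0.0
    for age_group, standard_count in standard_population.items():
        age_specific_rate = cases_by_age[age_group] / population_by_age[age_group]
        adjusted += age_specific_rate * (standard_count / standard_total)
    return adjusted * base


# Hypothetical standard population (weights only; not a real standard).
standard = {"0-39": 600, "40-64": 300, "65+": 100}

# Population A is older, population B is younger; both have the same
# age-specific rates (20, 100, and 300 per 100,000).
cases_a = {"0-39": 10, "40-64": 30, "65+": 60}
pop_a = {"0-39": 50_000, "40-64": 30_000, "65+": 20_000}
cases_b = {"0-39": 14, "40-64": 20, "65+": 30}
pop_b = {"0-39": 70_000, "40-64": 20_000, "65+": 10_000}

crude_a = crude_rate(cases_a, pop_a)
crude_b = crude_rate(cases_b, pop_b)
adjusted_a = age_adjusted_rate(cases_a, pop_a, standard)
adjusted_b = age_adjusted_rate(cases_b, pop_b, standard)
```

In this made-up example the crude rates differ (100 versus 64 per 100,000) only because population A is older. After adjustment both populations have the same rate, about 72 per 100,000, which reflects their identical age-specific disease experience.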

What year of data is used to age adjust the data?

Currently, we are using the 1940 standard population to age adjust our data. However, we will be switching to the new 2000 standard population in the near future.

Will I get data on each disease by specifying "All Diseases" in the outcome section and selecting "Cause" in the row field?

No, you will only get a summation of all diseases. To get a listing of each disease, specify "Cause" from the row key field list and highlight each specific disease that you are interested in. Remember that for many diseases there are multiple levels of data.

What if I need data that are not in the datamart or have a special request?

If you need additional data or have a special request, please contact us at the e-mail address provided at the bottom of the screen.

Who do I contact if I have any questions?

Please feel free to drop us a line using the e-mail address provided at the bottom of the screen detailing any problems you may have had, any questions, or your suggestions for improving our datamart. We are always interested in your suggestions and we will respond as soon as possible.


Should you have any questions, do not hesitate to email us at datamart@hhcorp.org.