- AKA: Non-Homogeneous Data.
- See: Complex Dataset, Domain-Dependent, Heterogeneous, Heterogeneous Data Analytics.
- (Fayyad, 1998) ⇒ Usama M. Fayyad. (1998). “Mining Databases: Towards Algorithms for Knowledge Discovery.” In: IEEE Data Engineering Bulletin, 21.
- QUOTE: Often mining is desirable over non-homogenous data sets (including mixtures of multimedia, video, and text modalities); current methods assume fairly uniform and simple data structure.
- (Brin et al., 1997) ⇒ Sergey Brin, Rajeev Motwani, Jeffrey D. Ullman, and Shalom Tsur. (1997). “Dynamic Itemset Counting and Implication Rules for Market Basket Data.” In: Proceedings of the 1997 ACM SIGMOD International Conference on Management of data (SIGMOD 1997). doi:10.1145/253260.253325
- QUOTE: Non-homogeneous Data: One weakness of DIC is that it is sensitive to how homogeneous the data is. In particular, if the data is very correlated, we may not realize that an itemset is actually large until we have counted it in most of the database. If this happens, then we will not shift our hypothetical boundary and start counting some of the itemset's supersets until we have almost finished counting the itemset. As it turns out, the census data we used is ordered by census district and exactly this problem occurs. …
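The ordering effect described in the quote above can be illustrated with a toy experiment. The sketch below is not an implementation of DIC. It only measures how many transactions a single sequential pass must read before an itemset's running count reaches a support threshold, for the same data in two orders. The helper `scans_until_large`, the two "districts", and the threshold are all hypothetical and chosen for illustration.

```python
from typing import FrozenSet, List, Optional

Transaction = FrozenSet[str]


def scans_until_large(
    transactions: List[Transaction],
    itemset: FrozenSet[str],
    min_count: int,
) -> Optional[int]:
    """Return how many transactions are read before the running count of
    `itemset` reaches `min_count`, or None if it never does."""
    count = 0
    for position, transaction in enumerate(transactions, start=1):
        if itemset <= transaction:  # itemset is contained in the transaction
            count += 1
            if count >= min_count:
                return position
    return None


# Hypothetical toy data: two "districts" with different buying patterns.
district_a = [frozenset({"bread"})] * 6
district_b = [frozenset({"bread", "milk"})] * 6

# Same transactions, two orders: grouped by district vs. interleaved.
ordered = district_a + district_b
interleaved = [t for pair in zip(district_a, district_b) for t in pair]

target = frozenset({"milk"})
min_count = 3  # assumed support threshold, as an absolute count

ordered_scans = scans_until_large(ordered, target, min_count)
interleaved_scans = scans_until_large(interleaved, target, min_count)

print(f"grouped by district: large after {ordered_scans} of {len(ordered)} transactions")
print(f"interleaved:         large after {interleaved_scans} of {len(interleaved)} transactions")
```

In this toy setup the grouped order needs 9 of 12 transactions before `{"milk"}` crosses the threshold, while the interleaved order needs 6. An algorithm that starts counting supersets only once an itemset looks large would therefore begin that work later on the grouped data.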