Breast cancer is a common, complex, and often deadly cancer caused by several known and unknown molecular alterations. Such changes may lead to abnormal cell proliferation, genetic variability and the acquisition of a progressively invasive and resistant phenotype. Early detection is of paramount importance so that the disease can be treated while it is still confined to its site of origin. This is challenging because malignant cells are heterogeneous and the host background is variable, giving rise to subgroups of molecularly distinct tumors that differ in phenotype and clinical outcome. Conventional clinicopathological parameters cannot always identify the most suitable diagnostic and therapeutic strategies. Furthermore, conventional cytotoxic drugs cannot distinguish cancerous cells from normal ones, leading to several adverse effects.
In recent years, scientists have been trying to identify critical diagnostic or prognostic factors to characterize the heterogeneity of the disease. They have identified a few essential genes, including ERBB2 (HER2), TP53, CCND1, BRCA1 and BRCA2, involved in oncogenesis of the mammary gland. Other markers detected in breast cancer patients include CA 15.3, CA 27.29, cathepsin D and cyclin E. Furthermore, targeting the hormone receptors and the ERBB2/HER2 receptor has allowed considerable therapeutic progress. A combination of such molecular markers is more sensitive, and hence more dependable, for screening, diagnosis, prognosis, prediction of the response to intervention and the search for new therapeutic targets.
Cancer develops from sequential genetic alterations that affect cellular processes such as growth regulation, senescence, programmed cell death, angiogenesis and metastasis. The initial search for markers was based on genomic and transcriptomic approaches, which presented many problems, the most significant being alternative splicing. This mechanism allows multiple proteins to be generated from a single gene; hence, relying on a particular gene or transcript as a biomarker is problematic.
A dynamic and precise picture of both the inherent genetic program of the cell and the effect of its immediate environment can be obtained by studying its proteome. Post-translational modifications of proteins (such as acetylation, phosphorylation and glycosylation) confer additional complexity on their structure; they cannot be detected at the transcriptome level, yet they influence protein stability, location, interactions and function. Furthermore, proteins are more accessible and are better therapeutic targets than nucleic acids. Thus, proteomics can provide the link between gene sequence and cellular physiology and can complement gene analysis for assessing disease progress, diagnosis and response to intervention.
Clinical proteomics in breast cancer
The proteomic techniques may be broadly divided according to whether they require prior biological knowledge. Those that do include tissue microarrays, which assess the expression of candidate markers across a large array of tumor tissues. Other promising microarray-based techniques, such as antibody arrays and reverse-phase protein arrays, can also be performed but are not often used in clinical applications. In contrast, mass spectrometry tools are employed to identify single or multiple protein markers that correlate with a given tumor phenotype. They are therefore among the techniques used for the clinical diagnosis of breast cancer.
Identifying potential molecular markers by high-throughput techniques has paved the way for an approach that validates them on an extensive series of samples. The tissue microarray (TMA), developed in 1998, allows the concurrent analysis at the DNA or protein level of up to 1000 tumor samples arrayed on a single glass microscope slide. A cylindrical core (diameter = 600 µm) is taken from each archived formalin-fixed paraffin-embedded (FFPE) tumor block and arranged in a new paraffin block. This block is subsequently cut into thin sections (about 100-200 per block) for investigation. All samples can be probed simultaneously with a specific antibody [as in immunohistochemistry (IHC)] and then analyzed morphologically.
TMAs have several advantages over IHC alone. They require less time and labour, are more economical and reproducible, and involve less complicated equipment. All samples in a TMA are tested under similar experimental conditions, and rigorous data collection is possible. TMAs also help in managing tissue archives, since they conserve tissue resources for further studies or for the testing and optimization of diagnostic assays.
TMAs are the most widely used proteomics technique in oncology, where the simultaneous profiling of many samples reinforces the statistical significance of the results. The primary application of the technique is the examination of individual molecular markers on a large series of samples. TMAs can be used to screen various tumor types for the proteins of interest. They can also be used to analyze the stages of tumor development for a particular cancer (e.g., cancer of the bladder and prostate). TMAs can be assembled from cell lines or other experimental materials, for example, xenografts.
However, several criticisms have been raised regarding the use of this method in the cancer field. The most common concern is that only a small sample of a potentially heterogeneous tumor is examined, compared with the traditional large section. In addition, because prior information about the protein to be analyzed is required, the method does not lend itself to the detection of novel markers, and bioinformatics tools are needed to analyze the multi-dimensional quantitative data. On the other hand, automated quantitative analysis of TMAs has provided rapid, reproducible and unbiased results. Hence, TMAs are used for protein sample analysis in the clinical diagnosis of breast cancer.
Mining the serum proteome to identify new biomarkers is one of the main goals of many clinical proteomics efforts. Analytical methods such as MALDI-TOF MS and SELDI-TOF MS are promising approaches for biomarker discovery. These MS-based approaches exploit the ability of mass spectrometers to separate peptides or proteins according to their mass-to-charge ratio (m/z), generating protein signatures that correlate with a given phenotype.
These methods offer the capability of profiling numerous proteins in serum samples simultaneously and over a wide range of molecular weights. Diverse comparative studies using this proteome-based strategy have enabled the identification of specific and substantial protein patterns in breast cancer.
Biomarker discovery can be carried out on sample types such as plasma. Still, serum is the most commonly used because it is a copious reservoir of molecules reflecting systemic functions and, unlike plasma, it is free from blood clotting factors. For routine linear profiling of human serum, MALDI-TOF MS and SELDI-TOF MS yield similar results [4-6]. Both MALDI-TOF and SELDI-TOF MS protein profiling studies have to be performed accurately and with adequate statistical analysis. Only then, in due course, will the full potential of protein biomarkers for improving cancer patient outcomes be realized.
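To make the m/z principle concrete, the following minimal sketch computes the m/z value at which a peptide or protein of a given neutral mass would appear when it carries one or more protons. The function name and the example mass are illustrative assumptions, and positive-mode protonation is assumed.

```python
PROTON_MASS = 1.007276  # mass of a proton in Da


def mz(neutral_mass, charge):
    """Return m/z for a molecule of `neutral_mass` (Da) carrying `charge` protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge


# Hypothetical 8565 Da serum protein at charge states 1 to 3.
for z in (1, 2, 3):
    print(z, round(mz(8565.0, z), 3))
```

In MALDI and SELDI most ions are singly charged, so the observed m/z is usually close to the molecular mass plus one proton.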
Mass spectrometry is generally used to identify proteins in a sample. It permits quantitative estimation of many unknown proteins present in the sample without requiring any previous biological knowledge, which makes it the preferred method for identifying protein biomarkers. However, the technique has limitations in sensitivity and reproducibility, and it has difficulty accessing the low-abundance proteome.
In this method, proteins that have been separated by 2D gel electrophoresis (or other methods) are ionized, transferred into the gas phase and analyzed according to their m/z ratio. The best sensitivity and resolution are obtained in the range below m/z 20000, which is the range used in most profiling instruments. Peptides are ionized by MALDI and then separated by their time of flight (TOF), which depends on m/z. The measured peptide masses are then searched against sequence databases to identify the corresponding proteins. The efficacy of this process depends on the availability of comprehensive sequence databases and expressed sequence tag databases.
When peptide mass mapping cannot provide an exact match, MS/MS can be used for fragment ion measurements. MALDI-Q-TOF and MALDI-TOF-TOF are commercially available tandem MS instruments that allow more extensive fragmentation data to be acquired for one or more peptides to confirm the peptide mass mapping results.
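The database search step can be illustrated with a minimal peptide mass fingerprinting sketch. It assumes tryptic digestion (cleavage after K or R, but not before P), singly protonated monoisotopic masses, and a tiny hypothetical two-entry database; the sequences, names and tolerance are invented for illustration and do not correspond to any real search engine.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276


def tryptic_peptides(sequence):
    """Cleave after K or R unless the next residue is P (assumed trypsin rule)."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def peptide_mh(peptide):
    """Monoisotopic mass of the singly protonated peptide, [M+H]+."""
    return sum(MONO[aa] for aa in peptide) + WATER + PROTON


def count_matches(observed, sequence, tol=0.05):
    """Number of observed masses within `tol` Da of a theoretical peptide."""
    theoretical = [peptide_mh(p) for p in tryptic_peptides(sequence)]
    return sum(any(abs(o - t) <= tol for t in theoretical) for o in observed)


def best_match(observed, database, tol=0.05):
    """Name of the database entry explaining the most observed masses."""
    return max(database, key=lambda name: count_matches(observed, database[name], tol))


# Hypothetical database and a simulated peak list with a 0.02 Da mass error.
DATABASE = {"protein_A": "GASKVLLRMTEK", "protein_B": "AAGKPWYRHHEK"}
OBSERVED = [peptide_mh(p) + 0.02 for p in tryptic_peptides(DATABASE["protein_A"])]

print(best_match(OBSERVED, DATABASE))
```

Real search engines score matches statistically and allow for missed cleavages and modifications; this sketch only counts masses that fall within the tolerance.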
In MALDI-TOF-TOF, instrument settings may be optimized for a low mass range (2000-20000 Da) and a high mass range (20000-100000 Da). All spectra must be processed with the same baseline subtraction procedure, and peaks must be detected using a consistent definition of the required signal-to-noise ratio and mass window. Since a single MALDI analysis yields hundreds of protein peaks, identifying the relevant biomarkers may be tedious and error-prone. Classification tools such as Fisher discriminant analysis and classification and regression trees (CART) have therefore been developed to make this process more convenient and accurate.
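As a simplified illustration of how such classification tools separate informative peaks from uninformative ones, the sketch below ranks peaks by the one-dimensional Fisher criterion (squared difference of group means divided by the sum of group variances). This is a reduced, single-feature form of Fisher discriminant analysis, not a full multivariate classifier, and all intensities are hypothetical.

```python
from statistics import mean, variance


def fisher_score(group_a, group_b):
    """One-dimensional Fisher criterion; assumes non-zero within-group variance."""
    return (mean(group_a) - mean(group_b)) ** 2 / (variance(group_a) + variance(group_b))


def rank_peaks(cancer, control):
    """Return peak m/z values ordered from most to least discriminating."""
    scores = {peak: fisher_score(cancer[peak], control[peak]) for peak in cancer}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical peak intensities (keyed by m/z) in four patients and four controls.
CANCER = {4283.5: [10, 12, 11, 13], 8566.0: [30, 29, 31, 30], 5905.0: [20, 25, 15, 22]}
CONTROL = {4283.5: [11, 12, 10, 13], 8566.0: [20, 21, 19, 20], 5905.0: [18, 24, 16, 21]}

print(rank_peaks(CANCER, CONTROL))
```

A peak whose group means differ widely relative to its within-group spread receives a high score and is a stronger biomarker candidate.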
Once detected by protein profiling, biomarker candidates may be subjected to TOF/TOF analysis to identify the peptides directly from serum profiles, using the same sample spot and/or re-spotting it. A preliminary analysis in reflector mode allows the target peptide or protein peak to be visualized. The metastable fragment ions of the selected precursor ions can then be analyzed after a further acceleration step. The resulting fragment pattern can be interpreted and used for peptide identification via a database search.
Protein profiling with this instrumentation is advantageous because the target peptide can be sequenced right away. In addition, MALDI-TOF spectra have better resolution than those produced by SELDI-TOF. The detection of good-quality protein signatures and the direct identification of protein markers make MALDI-MS well suited to serum proteome profiling, to the analysis of tissue biopsies and to the direct examination of tissue sections. This technique has been helpful for the analysis of lung, brain and breast cancer samples.
A disadvantage of MALDI-TOF is that it usually requires some upfront fractionation of the serum to reduce the complexity of the sample, for example, by using magnetic beads in combination with prestructured sample supports (such as Anchor Chip technology).
For MALDI-TOF to be carried out, the proteins must be separated before identification. The protein separation method generally used in laboratories is two-dimensional polyacrylamide gel electrophoresis (2-D PAGE), which separates proteins on a large scale in two dimensions: isoelectric focusing followed by gel electrophoresis. According to Bloom et al., nine breast cancers were analysed out of 77 adenocarcinomas from different primary sites. Nemeus et al. reported using 2D PAGE to identify the presence and absence of metastatic reversion in 20 patients after adjuvant treatment with methotrexate and fluorouracil. However, this method has the disadvantages of low reproducibility and limited throughput. Hence, 2D differential gel electrophoresis (2D-DIGE) was developed, in which the protein extracts are labelled with cyanine fluorochromes (namely Cy2, Cy3 or Cy5), and equal quantities of the samples to be compared are then combined and resolved by 2D PAGE. The migration pattern of the fluorochrome-labelled proteins is then visualized with a fluorescence imager at the wavelengths specific to each dye. Alternatively, high performance liquid chromatography (HPLC) can be used for protein separation and quantification to identify differentially expressed peptides from estrogen receptors responsible for breast cancer.
To improve the reproducibility and quantification of MS-based procedures, several labelling-based approaches have been developed, including Isotope-Coded Affinity Tags (ICAT), Stable Isotope Labelling with Amino acids in Cell culture (SILAC) and Isobaric Tags for Relative and Absolute Quantitation (iTRAQ).
ICAT technology uses chemical tags that label proteins on cysteine residues with either a heavy isotope label (13C) or a light isotope label (12C). ICAT reagents comprise a thiol-specific reactive group, an ethylene glycol linker group and a biotin tag. Cysteine-containing proteins from two distinct cell states are labelled with the heavy ICAT reagent for one state and the light ICAT reagent for the other. The samples are then mixed and digested with a proteolytic enzyme. Affinity chromatography on a streptavidin column is then carried out so that the peptides carrying the biotin tag bind to the column. The retained peptides are then subjected to mass spectrometry. The disadvantages of the method are that it detects only cysteine-containing peptides and may miss peptides with post-translational modifications.
In the iTRAQ method, the sample is labelled on lysine residues and on the N-terminus with iTRAQ reagents, which comprise a reporter group and a balance group. On fragmentation, the reagents release signature (reporter) ions whose relative peak intensities correspond to the proportions of the labelled peptides. The approach resembles ICAT, with the advantage that four different samples can be analyzed in one mass spectrometry run. This significantly reduces the cost. The disadvantage of this method is that it is a time-consuming process.
SILAC is among the most extensively used proteomics methods because it is based on in vivo labelling of entire cellular proteomes for quantification by MS. In the SILAC method, cells are grown in a culture medium containing isotopically labelled amino acids such as lysine or arginine. The heavy amino acids are incorporated as the cells grow. Incorporating these heavy amino acids provides greater protein coverage, which increases the confidence of identification. Arginine and lysine are the labelled amino acids of choice because trypsin cleaves after these residues, so that each peptide terminating in arginine or lysine can be quantified and compared. This technology has been used successfully to quantify relative protein abundance.
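The relation between reporter ion peaks and sample proportions can be sketched as follows. The example assumes the 4-plex reagent set, with reporter ions at approximately m/z 114, 115, 116 and 117; the nominal m/z values, the tolerance and the peak list are illustrative assumptions rather than instrument settings.

```python
# Approximate m/z of the four reporter ions of a 4-plex experiment (assumed values).
REPORTER_MZ = (114.1, 115.1, 116.1, 117.1)


def reporter_proportions(peaks, tol=0.05):
    """Return the fraction of total reporter signal contributed by each channel.

    `peaks` is a list of (m/z, intensity) pairs from one MS/MS spectrum.
    """
    intensities = [
        sum(intensity for mz, intensity in peaks if abs(mz - target) <= tol)
        for target in REPORTER_MZ
    ]
    total = sum(intensities)
    if total == 0:
        raise ValueError("no reporter ions found in spectrum")
    return [intensity / total for intensity in intensities]


# Hypothetical MS/MS peak list: four reporter ions plus one sequence fragment.
SPECTRUM = [(114.11, 200.0), (115.11, 400.0), (116.11, 200.0),
            (117.11, 200.0), (175.12, 950.0)]

print(reporter_proportions(SPECTRUM))
```

Here the second sample contributes twice the signal of each of the others, suggesting that the peptide is about twice as abundant in that sample.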
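A minimal sketch of SILAC quantification is given below. It assumes fully tryptic peptides with a single labelled C-terminal residue and, as a labelled assumption not taken from the text, 13C6-lysine (+6.0201 Da) and 13C6,15N4-arginine (+10.0083 Da) as the heavy amino acids; the peak list is hypothetical.

```python
# Mass shift (Da) of the heavy amino acids; the choice of labels is an assumption.
HEAVY_SHIFT = {"K": 6.0201, "R": 10.0083}


def silac_ratio(peptide, charge, light_mz, peaks, tol=0.02):
    """Heavy/light intensity ratio for one peptide pair.

    `peaks` is a list of (m/z, intensity) pairs; the heavy partner is expected
    at light_mz + shift / charge, where the shift depends on the C-terminal residue.
    """
    heavy_mz = light_mz + HEAVY_SHIFT[peptide[-1]] / charge
    light = sum(i for mz, i in peaks if abs(mz - light_mz) <= tol)
    heavy = sum(i for mz, i in peaks if abs(mz - heavy_mz) <= tol)
    if light == 0:
        raise ValueError("light peak not found")
    return heavy / light


# Hypothetical doubly charged peptide ending in arginine.
PEAKS = [(250.68, 1000.0), (255.684, 2500.0)]
print(silac_ratio("VLLR", 2, 250.68, PEAKS))
```

A ratio above 1 indicates that the peptide is more abundant in the heavy-labelled cell population.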
SELDI-TOF MS can provide large volumes of low molecular weight protein expression data and permits faster analysis of tumor protein patterns. In this method, solid aluminium or stainless-steel chips are engineered with bait surfaces (1-2 mm) that are either chromatographic supports (hydrophobic, cationic, anionic) or affinity supports (antibodies, purified receptors or ligand proteins, DNA oligonucleotides). A small amount (microlitre volumes) of solubilized tissue or serum is applied directly to the bait surface, which is then washed to eliminate unbound proteins. Only those proteins specifically bound to the bait surface remain, and these are subsequently analyzed by mass spectrometry.
A large number of protein species (up to 2000) can be detected in serum by this method. Univariate or multivariate statistical tools may be used to analyze the resulting spectral masses to produce a single-marker or multi-marker pattern capable of classifying clinical samples. Subsequently, the discriminatory protein peaks are purified and identified.
SELDI analysis software displays data either as a conventional mass spectrum or as a gel-like density plot. Following data collection, each spectrum must be mass-calibrated using appropriate peptide calibrants. As in MALDI-TOF, all spectra should be processed with the same baseline subtraction procedure and normalized by total ion current (as in Ciphergen software). During peak detection, a consistent definition of the required signal-to-noise ratio (usually 3) and mass window (usually 0.2-0.3%) must be maintained.
The SELDI technique was developed for profiling clinical biological fluids (such as serum). Several studies have shown its promise in identifying single biomarkers or complex patterns with diagnostic value; it is therefore used as a screening or initial diagnostic tool for the detection of breast cancer.
SELDI instrumentation does not require the extensive sample preparation needed for MALDI-TOF, and protein profiles can be obtained within minutes. It is easy to use, has high throughput and is relatively affordable, making it suitable for working with large sample groups in a clinical setting. However, the technique is not very sensitive and the results obtained are not always reproducible.
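The preprocessing steps described above can be sketched in a few lines. This is a deliberately crude illustration, not the algorithm of any vendor software: the baseline is estimated as a sliding-window minimum, the noise as the median intensity, and the spectrum is hypothetical. Only the signal-to-noise threshold of 3 and the 0.3% mass window come from the text.

```python
from statistics import median


def subtract_baseline(intensities, half_window=4):
    """Subtract a sliding-window minimum as a crude baseline estimate."""
    result = []
    for i, y in enumerate(intensities):
        lo = max(0, i - half_window)
        hi = min(len(intensities), i + half_window + 1)
        result.append(y - min(intensities[lo:hi]))
    return result


def normalize_tic(intensities):
    """Scale a spectrum so that its total ion current sums to 1."""
    total = sum(intensities)
    return [y / total for y in intensities]


def detect_peaks(mz, intensities, min_snr=3.0):
    """Return m/z of local maxima whose signal-to-noise ratio is at least min_snr."""
    noise = median(intensities)  # crude noise estimate (assumption)
    if noise <= 0:
        raise ValueError("noise estimate must be positive")
    return [
        mz[i]
        for i in range(1, len(intensities) - 1)
        if intensities[i] > intensities[i - 1]
        and intensities[i] > intensities[i + 1]
        and intensities[i] / noise >= min_snr
    ]


def same_peak(mz_a, mz_b, window=0.003):
    """True if two peaks from different spectra fall within the mass window (0.3%)."""
    return abs(mz_a - mz_b) <= window * mz_a


# Hypothetical spectrum with one clear peak at m/z 4030.
MZ = [4000, 4010, 4020, 4030, 4040, 4050, 4060, 4070, 4080]
RAW = [12, 11, 13, 60, 12, 11, 13, 12, 11]

processed = normalize_tic(subtract_baseline(RAW))
print(detect_peaks(MZ, processed))
print(same_peak(4030, 4040), same_peak(4030, 4050))
```

Applying identical parameters to every spectrum is what makes peak lists from different samples comparable.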
Diagnostic protein profiling
The best ways to reduce the burden of breast cancer are prevention and detection at an early stage. Accordingly, most MALDI/SELDI protein profiling analyses of breast cancer have been performed to search for novel diagnostic markers. All these diagnostic protein profiling analyses have been executed in vivo, involving the investigation of various biological matrices.
Protein profiling of tissue
The earliest changes leading to breast cancer occur in the tissue proteins and are caused by successive genetic mutations. It has been hypothesized that tissue proteins provide the highest number of biomarkers. Analysis of tumor tissue lysates by SELDI-TOF MS has revealed numerous peaks that were markedly related to cancer subtypes. Laser capture microdissection (LCM) enables a specific subset of cells to be selectively captured. Umar et al. used LCM and detected nine differential tryptic peptides following the analysis of stromal and tumor cells collected from 5 tissue samples. Sanders et al. identified ubiquitin and S100A8 as diminished in tumor tissue compared to normal tissue. Over time, other sophisticated methods came into use for detecting tissue protein anomalies in malignant cases. Einaga et al. used formalin-fixed paraffin-embedded (FFPE) tissue specimens combined with histopathology to apply LCM to target cells. Protein biomarker assessment of solid tumors is predominantly done by immunohistochemistry (IHC). IHC can be complemented by next-generation sequencing (NGS) to analyze precisely the genetic events that occur during cancer. Real-time polymerase chain reaction may be used to confirm the mutations detected by NGS. Parra et al. reviewed several multiplexed methodologies and image analysis approaches intended to improve on traditional methods like IHC. The multiplexed methods were capable of characterizing tissue microenvironments by studying one or more tissue samples. These techniques were more efficient in studying disease diagnosis and prevention and shed light on immune cell co-expression and the spatial distribution of immune cells in the tumor microenvironment. Yu et al. developed the first immunoassay for absolute quantification of HER2 protein levels in FFPE breast cancer tissues, based on the quantitative dot blot (QDB) method.
These changes in tissue proteins help in understanding the pathogenesis of breast cancer, provided one circumvents the complexities associated with tissue sampling.
Profiling of plasma and serum proteins
Since blood is thought to offer a dynamic reflection of physiological and pathological status, human blood plasma and serum represent the most widely studied matrices for breast cancer biomarkers. Serum and plasma also contain specific tumor-secreted proteases and proteins formed by the local and remote responses to the cancer. In addition, blood is a readily available matrix that is easy to sample and permits repeated collection, which increases the medical significance of prospective blood-borne markers.
Numerous MALDI-TOF MS and SELDI-TOF MS peaks have been described that distinguish the plasma of breast cancer patients from that of healthy controls. Becker et al. used SELDI-TOF to obtain peaks whose expression differed significantly among breast cancer patients with or without BRCA mutation. Early detection of disease can be made possible by serum proteome profiling studies. Specific biomarkers in the serum of cancer patients are elevated or depleted compared to normal, healthy individuals. Biomarkers such as CA 15.3 are overexpressed in cancerous breast cells and can therefore be used to diagnose breast cancer. Another biomarker is CA 27.29, which can also be used but is less specific than CA 15.3. Several other biomarkers have also been detected to date. For example, Lee et al. developed a plasma protein signature for breast cancer detection using mass spectrometry based on multiple reaction monitoring. They found 11 proteins that exhibited significantly differential expression in plasma during malignant tumorigenesis. Of these, three proteins (neural cell adhesion molecule L-1 like protein, apolipoprotein C-1 and carbonic anhydrase-1) gave consistent, statistically valid outcomes for patients with type I and type II breast cancers. This three-protein model diagnosed breast cancer in asymptomatic women and allowed effective estimation of plasma proteins without using antibodies.
Some other critical applications of plasma and serum in cancer detection have also been developed, including monitoring miRNA levels; higher miRNA levels in breast cancer patients correlate with poor prognosis [20, 21]. Liquid biopsy (analysis of peripheral blood samples), an example of precision oncology, may also be used to circumvent the problem of repeated tissue sampling, thus providing a more attractive alternative. This diagnostic tool is minimally invasive, and it could overcome the limitations of surgical biopsy. Early diagnosis is made possible by analyzing circulating tumor cells, circulating tumor DNA and extracellular vesicles such as exosomes. Of these, exosomes mirror the biological footprint of the parental cells from which they originate, thus serving as highly promising predictors for early cancer diagnosis and treatment response.
Profiling of saliva proteins
Using saliva in protein profiling has several advantages, including non-invasive sample collection, the possibility of repeated sampling and the ease of sample handling. The use of saliva in protein profiling of breast cancer has been demonstrated by the detection of increased soluble c-erbB-2 and CA 15.3 in breast cancer patients compared with healthy individuals. Using SELDI-TOF MS, five high molecular weight peaks were observed to be overexpressed in breast cancer patients compared with controls. With the help of enzyme-linked immunosorbent assay (ELISA), it was determined that the levels of the EGF, VEGF and CEA markers in the saliva of breast cancer patients were elevated compared to normal, healthy individuals. Liu et al. assessed the changes in salivary glycopatterns using lectin microarray probes. The salivary glycosylation pattern in breast cancer patients was altered relative to that of healthy individuals, and changes in the salivary glycopatterns may allow the detection of patients with early-stage breast cancer. Streckfus et al. detected, using a chromogenic tripeptide assay, that the concentration of kallikrein was increased in patients with malignant tumors, and its use as a biomarker was confirmed. The concentration of EGF also increased in such cases and was used as a marker for post-surgical examination of diagnosed cancer patients. Sawczuk et al. investigated whether BRCA1 mutations affect the salivary redox profile by evaluating the secretory function of the salivary glands, biomarkers of redox balance, and oxidative damage to proteins and lipids in the saliva of subjects with a BRCA1 mutation. They found that people with this mutation were predisposed to early salivary gland dysfunction and showed oxidative damage to salivary proteins and lipids. Hence, on the basis of techniques such as cluster analysis, proteins like salivary peroxidase may be considered biomarkers.
Proteomics of breast cancer has a bright future ahead of it. It allows the classification of targets for definitive therapy. Also, proteins are good markers. Specific protein isoforms expressed by tumors can be detected in tumor tissue and in patient fluids (such as serum), enabling early detection. Unlike other methods, proteomics may also be applied to cellular components, such as the nucleus, cell membrane and organelles. Furthermore, proteomics may be readily combined with functional assessments, such as antibody arrays. Recently, the first monoclonal antibody-based microarrays have been used to study breast cancer cell lines. This identified IL-8 as a potentially crucial factor in breast cancer invasion and progression. Most MALDI and SELDI protein profiling studies in breast cancer search for new diagnostic markers, whereas the exploration of novel predictive biomarkers is confined to a limited number of analyses.
Certain areas of proteomic diagnosis are still not adequately understood. Several of the prospective breast cancer markers identified have been found to have diagnostic potential for other types of cancer, for example, apolipoprotein A-1 in ovarian cancer. This indicates a general lack of tumor specificity. In addition, the identified candidate markers are sometimes normal cellular proteins or abundant blood proteins involved in coagulation and the acute phase reaction. Since their biology cannot be directly connected to tumor biochemistry, one of the final objectives of protein profiling studies (accumulating information about the molecular mechanisms responsible for cancer by identifying the discriminating proteins produced exclusively by cancer cells) remains to be fulfilled.
Recently, proteogenomics, in which genome analysis is complemented by proteome profiling, has come to be considered one of the most promising tissue proteomic profiling methods for breast cancer. Mertin and coworkers have used proteogenomics to identify therapeutic targets for breast cancer treatment. However, further studies on proteogenomics are necessary for a better understanding of this technology.
The application of proteomics in breast cancer diagnosis has provided more structure and logic for managing this once-feared disease, so that patients and clinicians are no longer as helpless before it as they were in earlier times. Characterizing the discriminator proteins will undoubtedly offer new indicators suitable for screening, diagnosis, prognosis and management of breast cancer. Oncologists and pathologists can now use the proteomic tools mentioned in this review to design customized intervention approaches based on the molecular profiles of individual tumors, monitor treatment progress, and detect any hint of toxic side effects. These tools have helped, and continue to help, in the development of newer molecularly directed anticancer drugs, which will hopefully improve patients' quality of life and life expectancy.